Green-sky thinking


Environmental agencies must go much further in regulating aircraft emissions if they
want to make a real difference.


Attitudes towards flying say a lot about someone’s view on global
warming. A hardy bunch of committed worriers take the train
instead, whereas others still celebrate the jet-set lifestyle as a sign
of success. Then there are those who fly, but feel guilty about doing so.

Aviation has become a symbol of the world’s reluctance to make
serious efforts to tackle climate change — perhaps unfairly, given
its relatively slight (although growing) contribution to the
global-warming problem. On an individual level, those who travel by air
leave gigantic carbon footprints, governments continue to invest in
runways and airports, and the industry remains focused on growth.

Most international frameworks to tackle carbon emissions struggle
to include aviation. When the European Union tried to encompass
emissions from international aviation in its emissions-trading scheme
in 2012, it met with widespread protest from the industry and
governments. Instead, the International Civil Aviation Organization (ICAO)
— the United Nations body that oversees the skies — agreed to come
up with its own measures.

The world saw the initial results of the ICAO’s work last week, when
the organization proposed a new global carbon dioxide standard for
aircraft (see page 266). It was hardly an inspiring achievement. The
proposed regulation, which is expected to be adopted later this year,
is complex, but the gist is that all new aircraft would need to meet
minimum fuel-efficiency standards by 2028. The ICAO says that the
rule will guarantee reductions in CO₂ emissions. This may be true,
but it is misleading.

An independent assessment by the International Council on Clean
Transportation (ICCT) suggests that new aircraft would emit on
average 4% less CO₂ when the measure takes full effect. However, each
generation of new aircraft is already made to be more fuel efficient than
the last, and the same independent assessment highlights that aircraft
manufacturers are likely to achieve an efficiency improvement of more
than 10% by the time the new standard kicks in, effectively rendering
the rule redundant.

Still, the most notable thing about the global standard will be that
it exists. It is both a precedent and a tool that could one day be used to
push the industry further than it would go of its own accord.

Individual countries could yet adopt stricter regulations. Last year,
the US Environmental Protection Agency (EPA) issued an
‘endangerment finding’ for aviation emissions, which represents the first step
in a regulatory process under the country’s Clean Air Act. The EPA is
expected to finalize its finding in the coming months, and then it could
launch its own regulatory proposal. The agency could, and should, go
well beyond the ICAO standard on new aircraft, and introduce rules
for existing aeroplanes.

The EPA will not be able to complete this process before President
Barack Obama leaves office, so it will be up to whoever is elected
president in November to follow through. Given the general opposition by
US conservatives to any kind of action on climate change, there is little
hope of getting a strong regulation from a Republican administration.
Moreover, whatever the EPA proposes will surely be challenged in
the courts, which can be fickle and unpredictable, as evidenced by
the Supreme Court’s decision last week to block implementation of
Obama’s power-plant regulations pending the outcome of a legal
challenge. But one thing is clear: the EPA must act on flights, and
environmentalists will surely take the agency to court if it does not.

Nor is the ICAO’s work done. The body will now address a plan to halt
emissions from international aviation at 2020 levels. This is crucial
because international aviation emissions already account for roughly
1.4% of global CO₂ emissions and are currently unregulated. Even the
global climate agreement signed in December in Paris neglected to
account for emissions from aviation or from international shipping,
which is responsible for nearly 1.8% of the world’s CO₂ emissions
(see page 275).

“Now is a good time to invest in a much cleaner future.”

Zero-emissions aircraft are not likely to be flying any time soon, so 
the key to the ICAO’s idea is the use of carbon offsets. It is probable 
that some kind of fee would be levied on international flights to pay 
for emissions reductions elsewhere. But there is scope to go further 
on cleaner aircraft too. 

Airlines are currently reaping profits thanks to the collapse of the 
oil market, which has lowered fuel prices across the board. Despite 
opposition from the aviation industry to strong emissions rules, now 
is a good time for it to invest in a much cleaner future.


Back to Earth 


Success against cancer need not deliver 
the Moon. 


When John F. Kennedy pledged in a 1961 presidential speech
to land a man on the Moon and return him safely to Earth,
he launched more than a space programme. He introduced
the ultimate metaphor. Today, moonshots no longer need to shoot for
the Moon. They can signify merely the launch of a grand effort fuelled
by bold ambition that will elevate society to some new heights.

The latest is the US Cancer Moonshot, a US$1-billion plan, to be 
spearheaded by vice-president Joe Biden, that aims “to eliminate 
cancer as we know it”. 

The project and the promised investment are welcome indeed. 
The name and the rhetoric less so, and not just because they are so
unoriginal — moonshots and Apollo programmes have been launched
in recent years on everything from renewable energy and neuroscience
to an assortment of Google X pet projects and at least three efforts to 
fight cancer. 

Perhaps the United States was due for another national promise to 
cure cancer: the last — a 2005 pledge by Andrew von Eschenbach, then 
head of the US National Cancer Institute — was scheduled to have
vanquished the disease by 2015. This followed then President Richard
Nixon’s 1971 pledge to use $100 million to cure cancer. To be sure, pledges to
cure cancer have a long history of succeeding in one respect: fundraising.
But the idea that $1 billion could eliminate cancer is misleading, and 
only becomes more so as each passing year reveals more about the full 
challenge of fighting the disease. With apologies to Biden, the more we
‘know it’, the harder it becomes to think we can ‘eliminate cancer’.

Today, we have a clearer view of cancer’s complexity. The sequenc- 
ing of tumour genomes has revealed heterogeneity not only among 
cancers and patients, but in a single tumour. Within those complex 
mixtures of cells can lurk mutations that give rise to drug resistance. 
Therapies against cancer-causing mutations have been transformative 
for some patients in the short term, but nearly always fail in the long 
term as resistant cells reseed the tumour. 

Real progress is being made, little by little. Chemotherapy cures 
more than 85% of children with acute lymphocytic leukaemia, for 
example. And for a small number of patients with various cancers, 
new immunotherapies have produced remissions so prolonged that 
doctors have begun to whisper the word ‘cure’. But although combinations
of these therapies could hold the key to expanding their success,
testing combinations in clinical trials is complex — and quite likely to 
cost more than $1 billion. As with everything, the more successful we 
get, the harder it is to improve. For many cancers, a more reasonable 
aim might be to turn them into chronic, manageable diseases. 

In statements and conversations, Biden has acknowledged this 
complexity, and has even reportedly expressed regret for choosing 
the moonshot theme. It is unfortunate that sound bites from Biden and
the White House continue to back its simplistic framework. Repeated
invocations of these bold but doomed quests to cure cancer in a dec- 
ade, or with a given sum of money, feed public cynicism about the 
value and potential of science. And setting an unreachable goal plays 
down the tremendous progress that cancer researchers have made. 
Details of the latest cancer moonshot are sketchy; Biden is still gathering
input from the country’s scientific glitterati. From what we know, he
hopes to double the pace of cancer research by breaking down the
barriers — logistical and cultural — that keep researchers from sharing
data. This can include having electronic medical records that talk to
one another, encouraging collaboration, and developing central
repositories that can handle big data. Some of these issues are already
being tackled: the National Cancer Institute, for example, is putting
together a large database that aims to unite disparate data sets, along
with clinical information and the privacy concerns entailed, in one place.

There is also no guarantee that Obama's $1-billion request will come 
to pass. Congressional leaders have pledged to ignore the president's 
budget request. And Obama sought to establish the funding for the 
cancer moonshot using an unusual approach that would circumvent 
the usual congressional funding process. Congress is unlikely to sign 
up to that. But there is hope that the programme will survive in some 
form: this Congress has a soft spot for medical research, and Biden is 
an authority on the art of congressional compromise. 

Let us hope it will. Biden's early vision of the programme, if exe- 
cuted well, has the potential to be high-impact. Cancer research is in 
the middle of a revolution, and may be on the brink of even greater 
success. The US Cancer Moonshot has the potential to build on this 
momentum. The project does not need to mislead the public, and 
damage its trust in science, in the process.


“Setting an unreachable goal plays down the tremendous progress that cancer researchers have made.”


Chow down 


Scientists should pay more heed to the varying 
effects of diet and environment on animal work. 


Japanese scientists last year reported the results of an extraordinary
experiment in animal nutrition. The mice they worked with could
well have been the best fed in the history of research — not in terms
of quantity of food, but in its quality.

On a typical day, one group of mice got to eat mixed rice with dried
whitebait and green seaweed flakes for breakfast, together with cooked
beans and miso soup containing the root vegetable taro and Japanese
mustard spinach. Another group got bacon and eggs, toast and fluffy
boiled potatoes. Lunch for one group could be simmered pumpkin and
ground chicken, with a portion of cucumber and wakame seaweed with
vinegar dressing. A different group of mice got a hamburger and salad.

Dinner was selected by the scientists from dishes including prawns
with chilli sauce, Sichuan-style bean curd, fried Japanese horse
mackerel, white radish and shimeji mushroom soup, sake-steamed clams and
steamed Japanese seerfish. The mice ate from that kind of menu every
week of their lives. There was no pudding, but probably no complaints.

The reason for all this gourmet cuisine was to recreate the typical
Japanese diet from decades past, and to examine its impact on health.
The long life expectancy of people in Japan has been attributed to the
benefits of Nihon shoku or traditional diet, involving fermented foods
that seem to boost the protective effect of harmless microbes on and
in the body. As the food available in Japan has become increasingly
Westernized, the effects on health are being questioned.

Hence the mouse study. Each group was given dishes from recreated
daily menus for a typical Japanese household in 1960, 1975, 1990 and 
2005. The food was ground up and fed to the animals along with their 
regular meals of standard laboratory chow. As the scientists suspected,
the animals that were fed the older, more traditional diets lived and 
prospered for longest (K. Yamamoto et al. Nutrition 32, 122-128; 2016). 

There are two things to note. The first is the large contribution that 
the environment — in this case diet — can make to health. The second 
is that such experiments enable the impact of environment on health 
to be assessed in ways that are simply not possible for human trials. 

Light, heat, food, company, exercise, distractions, stress — all are 
at the fingertips of scientists who set up mouse experiments. Subtle 
changes in any of these can lead to profound, and potentially useful, 
discoveries about how health is changed by external factors. Research 
has probed, for example, how giving mice tunnels, stairs and wheels to 
play with can alter how female mice interact with their young, which 
in turn alters the brain development of the offspring (T. Begenisic 
et al. Neurobiol. Dis. 82, 409-419; 2015). There is some evidence that
modern, sterile, individually ventilated cages — used to minimize the 
effects of environmental factors such as disease — are quieter and less 
smelly for the mice, which reduces stimulation of those sensory systems. 

Given that we know that environment affects the outcome of 
experiments, it is surprising that we don't know more about the 
environmental set-up of other studies — those that test the impact of 
a potential medical treatment, for example. As we report in a News 
story on page 264, many researchers who use mice do not even know 
the content of standard lab chow and how it may 
change from study to study. As scientists hunt 
for sources of irreproducibility in their research, 
variation in living conditions — and how to 
reduce it — deserves more attention.


> NATURE.COM 

To comment online, 
click on Editorials at: 
go.nature.com/xhunqv 


© 2016 Macmillan Publishers Limited. All rights reserved 


WORLD VIEW


Put innovation science at
the heart of discovery


The success rate of discoveries would be improved if we could find out how to
innovate, argues Andrew Kusiak.


Innovation is being talked about everywhere. The US Senate is working
on a biomedical innovation bill. Australia’s main funding agency
has just announced that it will cut hundreds of climate scientists as
part of its National Science and Innovation Agenda. The National
Council of Science Museums in India will add Innovation Hubs at its centres.
More and more organizations are using innovation in their names and
brands. Innovation is a central plank of national and local policies and it
consumes billions of dollars of investment worldwide. Yet the evidence
base for these innovation efforts is close to nothing. We simply do not
know how the innovation happens. We should do more to find out.

Innovation is commonly confused with invention and creativity.
Creativity is the ability to generate original ideas, concepts and objects.
It spurs invention, which is most evident in the areas of technology and
business. Artists enjoy creativity, whereas engineers and scientists focus
on inventions. But innovation demands a third
ingredient: market success.

History contains examples of the rapid transformation
of creativity into market success: Picasso
managed to earn an income from his creations,
and Disney’s theme parks are lucrative tourist
attractions. But disruptive innovations — those
that have transformative impacts, such as the
steam engine or the Apple iPhone — are rare.

The path to innovation is currently more art
than science. That might explain why it is shockingly
inefficient: the chance of an invention
attaining enough commercial or social success to
be recognized as an innovation reaches no more
than low single percentages. In the United States’
Small Business Innovation Research programme, a very low proportion
of grants results in a viable economic activity, product or service. In
markets that are saturated, such as those of mobile phones or medical
discoveries, the success rate is even lower.

Governments want innovation that not only transforms the industry,
but also offers solutions to the ‘grand challenge’ problems that the world
faces: alternative energy sources, mitigating climate change, eliminating
poverty and improving health care and security. Many research
programmes in these and other areas claim to be innovative, because they
seek and apply new approaches to a specific problem. Others seem to
believe that the research itself is innovative because it produces new
findings, or that the results will inevitably lead to innovative outcomes.

But it is not that simple. There is no deep understanding of the
innovation process, which is complex and has not been well captured or
formalized. There is no unified theory or reliable model for innovation.
There is no innovation science.

THERE IS NO UNDERSTANDING OF THE INNOVATION PROCESS.

NATURE.COM
Discuss this article online at: go.nature.com/mewfri

How could the science of innovation happen? There are several areas in
which research could initiate and potentially formalize it: for example,
the study of patents or creative individuals such as musicians and
painters. We could use results from these studies to
conceptualize and model the innovation process from generalizations 
identified across the engineering, arts, science and social domains. 

The task is complex and enormous. The first step could be to look for 
shared factors and to devise rules and hints that support innovation. It 
might help to look backwards. Firms and individuals often claim that 
they have learned from mistakes, but how many analyse failure system- 
atically? Patent libraries are packed with submissions that never get used, 
and many research programmes and clinical trials do not lead to suc- 
cess. Analysis of these failures could help others to succeed, and could 
contribute to an understanding of what drives innovation. 

Another backward-looking approach is imagining the best, then 
scaling it back to reality. Imagine an item of office furniture with all- 
encompassing functionality: it fulfils all the needs of the worker, but 
also changes colour according to the weather 
and adapts to the height and weight of the per- 
son. Limitations of technology, prohibitive cost 
and anticipated market response are then used to 
scale it back into marketable items: a chair whose 
height is adjustable and a desk that can be set at 
two height levels, and both would be available in 
different colours. Computer printers were innovated
in a similar way by incorporating functions
beyond printing. 

A building block of innovation science is con- 
necting seemingly unrelated ideas. We are flooded 
with discoveries in isolated domains. Making 
quick connections between, for instance, biology 
and technology, could lead to bigger ideas and 
redirect research and development. 

Innovation-science researchers must develop models of the market 
and products to predict successful outcomes. These models could be 
based on emerging evolutionary computation, and would be developed, 
validated and tested using streams of data, such as consumer interests 
and preferences. 

Over the long term, private foundations should establish a global 
initiative at a scale similar to that of the Breakthrough Energy Coalition 
spearheaded by Microsoft co-founder Bill Gates. (In fact, that coalition 
could itself greatly benefit from innovation science.) 

In commercial terms — comparing investment to output — the inno- 
vation process might have a failure rate of 99%. That would simply not be 
tolerated by any other commercial enterprise. Mainstream industry has 
moved to six-sigma programmes and beyond, barely tolerating one or 
two errors in a million. Governments and scientists should focus less on
discussing various forms of innovation and more on how to innovate.


Andrew Kusiak is professor of mechanical and industrial engineering 
at the University of Iowa in Iowa City. 
e-mail: andrew-kusiak@uiowa.edu


Selections from the
scientific literature 


RESEARCH HIGHLIGHTS 


Bacterial version 
of an eyeball 


A freshwater bacterium can 
sense the direction of light by 
acting as a microscopic lens. 
Spherical cyanobacteria 
called Synechocystis, which 
harvest energy from light, use 
protein appendages to pull 
themselves over wet surfaces 
towards light sources. A team 
led by Conrad Mullineaux 
at Queen Mary University of 
London found that the cells 
act as tiny lenses that bend and
focus light. When the team 
illuminated one side of the cell, 
a bright spot appeared at the 
opposite end. Simulating this 
spot with a laser beam caused 
the Synechocystis cells to move 
away from the spot, towards the 
perceived source of light. 
Light-sensing proteins 
embedded in the cellular 
membrane trigger the bacteria 
to move towards light, the 
researchers suggest. 
eLife http://doi.org/bcgd (2016) 


Brain circuit
for loneliness


A neural circuit at the base
of mouse brains drives a
loneliness-like state and
motivates the animals to seek
company.

Kay Tye at the
Massachusetts Institute of
Technology in Cambridge,
Mark Ungless at the Medical
Research Council’s Clinical
Sciences Centre in London
and their colleagues found that
connections between neurons
in the circuit were stronger in
mice that were separated from
their cage mates than in those
that were grouped together.
Those neurons then fired
more frequently when isolated
mice were put in a cage with an
unfamiliar mouse, compared
with animals that had not
been isolated. When the
scientists inhibited the
neurons with light, the
isolated mice showed less
interest in the stranger.
Activating those neurons
caused the animals to actively
seek other mice.

The circuit was more
responsive in socially
dominant animals.

Cell 164, 617-631 (2016)


PALAEONTOLOGY


Termites had first castes


The separate castes of social insects — queens, workers and
soldiers — first appeared in termites at least 100 million years
ago, according to newly discovered fossils.

Although evolutionary studies have suggested that termites
were first to evolve castes, there was very little fossil evidence
for this. Now, Michael Engel at the University of Kansas in
Lawrence, David Grimaldi at the American Museum of
Natural History in New York and their colleagues report
the discovery of six termite species fossilized in amber from
Myanmar, which show evidence of distinct castes.

The 100-million-year-old fossils include two new species,
which the authors call Krishnatermes yoddha, with queens,
workers and soldiers (pictured), and Gigantotermes rex,
whose 2-centimetre-long soldiers are among the largest ever
reported.

Curr. Biol. http://doi.org/bch6 (2016)


Exploding bubbles
kill cancer cells 


A technology using tiny 
exploding bubbles inside 
tumours could one day help 
to mop up cancer cells during 
surgery. 

Surgeons lack the tools to 
detect microtumours during 
cancer surgery, increasing
the risk that cancer will come
back. To address this, Dmitri
Lapotko, now at the medical-
technology firm Masimo
in Irvine, California, and
his colleagues injected gold
nanoparticles into tumour- 
bearing mice before surgery. 
The particles, which have 
cancer-specific antibodies on 
their surfaces, were taken up by 
cancer cells. After removing the 
main tumour, the researchers 
heated up the nanoparticles 
using a short laser pulse, 
causing nanobubbles to form 
and explode only in the cancer 
cells, destroying them without 
harming normal tissue. 

The explosions generated a 
detectable acoustic signal. 

The team spotted small 
numbers of these cells during 
surgery that would otherwise 
have gone unnoticed. After the 
surgery, no tumours regrew in 
any animals in which residual 
cancer was removed, whereas 
more than 80% of the mice 
that had standard surgery died 
of recurring tumours. 

Nature Nanotech. http://dx.doi.org/10.1038/nnano.2015.343 (2016)


Power from water 
and graphene 


Chemists have generated 
electricity from water by 
passing it through a material 
containing atom-thick sheets 
of carbon. 

Liangti Qu at the Beijing 
Institute of Technology and 
his colleagues developed 
a 3D structure made from 
graphene oxide that had holes 
big enough to let moisture pass 
through them freely. The water 
molecules reacted with oxygen- 
containing groups in the 
graphene oxide, dissociating 
to form hydrogen ions. The 
oxygen groups were distributed 
unevenly in the material, with
more at the bottom than the
top, resulting in a large-enough
flow of ions to generate electric
power.

The material was 
sandwiched between two 
aluminium electrodes studded 
with holes to let moisture pass 
through. The resulting power 
illuminated a light-emitting 
diode lightbulb. 

Energy Environ. Sci. http://doi.org/beg2 (2016)


Polymers bolster 
printed tissue 


A 3D printer can generate 
tissues using cells and 
polymers that are larger and 
more robust than previously 
printed biological structures. 
Bioprinted 3D organs could 
one day help people who 
need transplants, but existing 
methods tend to produce 
only small, simple structures. 
Larger ones lose their shape, 
or die because nutrients 
cannot reach their centres. 
To build larger, stronger 
tissues, Anthony Atala and 
his colleagues at Wake Forest 
Institute for Regenerative 
Medicine in Winston-Salem, 
North Carolina, devised 
a 3D-printing system 
that adds biodegradable 
polymers for structural 
support. By combining 
polymer-based frames with 
hydrogels containing cells, the 
researchers printed a human- 
sized ear (pictured), a human 
jawbone fragment, a segment 
of mouse muscle and a piece 
of rat skull. Microchannels 
printed in the structures 
helped nutrients to flow into 
the tissues. 


The team implanted some of 
the structures into rodents and 
found that the tissues survived 
over weeks and months. 

Nature Biotech. http://dx.doi.org/10.1038/nbt.3413 (2016)


Stored water 
slows rising seas 


Changes in water storage on 
land may have slowed sea-level 
rise during the past decade. 

John Reager of NASA's Jet 
Propulsion Laboratory in 
Pasadena, California, and his 
team investigated the shifting 
volumes of water stored on 
land using global data from 
NASA’s Gravity Recovery
and Climate Experiment 
(GRACE) satellite, which 
calculates water and ice mass 
on the basis of changes in 
Earth’s gravity field. They 
found that between 2002 
and 2014, 3,200 gigatons 
more water than expected 
was stored on land as snow, 
soil moisture, surface water 
and groundwater, thanks to 
climate-driven changes in 
hydrology. This offset sea- 
level rise caused by melting 
glaciers and ice sheets by 
about 20% over the same 
period. 

These results show that 
climate-driven land water 
storage is significant enough to 
be included in future estimates 
of sea-level rise, the authors say. 
Science 351, 699-703 (2016) 
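
As a rough cross-check of these figures, the stored mass can be converted into an equivalent change in global mean sea level. The sketch below is illustrative only and is not the authors' method: the ocean surface area (about 3.618 × 10^14 square metres) and the density of water (1,000 kilograms per cubic metre) are assumed values that do not appear in the text, and the function name is hypothetical. Only the 3,200 gigatons and the 2002 to 2014 period come from the summary above.

```python
# Assumed constants (not taken from the article).
OCEAN_AREA_M2 = 3.618e14        # approximate global ocean surface area
WATER_DENSITY_KG_M3 = 1000.0    # fresh-water density; a simplification
KG_PER_GIGATON = 1e12           # 1 gigaton = 10^12 kg


def gigatons_to_sea_level_mm(mass_gt: float) -> float:
    """Spread a mass of water evenly over the ocean; return the depth in mm."""
    volume_m3 = mass_gt * KG_PER_GIGATON / WATER_DENSITY_KG_M3
    depth_m = volume_m3 / OCEAN_AREA_M2
    return depth_m * 1000.0


# Figures quoted in the text.
stored_gt = 3200.0
years = 2014 - 2002

total_mm = gigatons_to_sea_level_mm(stored_gt)
rate_mm_per_year = total_mm / years

print(f"Sea-level equivalent: {total_mm:.1f} mm in total")
print(f"Average rate: {rate_mm_per_year:.2f} mm per year")
```

Under these assumptions, the extra water held on land corresponds to a little under 9 millimetres of sea-level rise, or roughly 0.7 millimetres per year over the period.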


Metabolism varies
within tumours


Human lung-tumour cells
break down sugars in different
ways in different patients and
even in the same tumour.
Cells in the same tumour
are known to vary genetically.
To study tumour metabolism,
Ralph DeBerardinis at
the University of Texas
Southwestern Medical Center
in Dallas and his colleagues
infused a harmless carbon
isotope into nine people who
had lung cancer, and combined
clinical-imaging techniques
with mass spectrometry to
track how the carbon was
biochemically processed by
the tumours. Cancer cells are
thought to feed on glucose and
release a by-product called
lactate, but the team found that
some tumour cells consumed
both lactate and glucose.
Blood-flow patterns showed
that different parts of a single
tumour had varying metabolic
patterns.

Understanding these
patterns could help to improve
metabolic therapy for cancer,
the authors suggest.

Cell 164, 681-694 (2016)


Cockroaches
inspire robot


Researchers have discovered
how cockroaches can speed
through gaps just millimetres
high — and have used
their findings to build a
compressible robot.

Kaushik Jayaram and
Robert Full at the University of
California, Berkeley, observed
American cockroaches
(Periplaneta americana) as
they squeezed through a series
of crevices that decreased
in height. The insects could
maintain speeds of up to
60 centimetres per second after
entering these tight areas, and
only slowed when the ceiling
height reached 4 millimetres
— about one-third of the
insects’ free-standing height.
They achieved their speed
by using their legs and feet to
push against friction between
their bodies, the ceiling and the
ground.

The cockroaches’ mode
of locomotion inspired the
development of a soft-bodied
robot (pictured left) that can
compress its height by half
(pictured right), allowing it to
move through a tight space.
Proc. Natl Acad. Sci. USA
http://doi.org/bch5 (2016)


Popular topics
on social media


Do female programmers face bias?


Female software developers see their contributions of code
accepted more frequently by the open-source software
repository GitHub than do men, according to a preprint
that attracted much attention last week on social media.
But this happened only when the contributor’s gender
was not obvious from their GitHub profile page. When
gender was made clear, the acceptance rate for women
fell to slightly less than that for men, say researchers who
analysed data on the activity of more than 1.4 million users
of GitHub. Morgan Ernest, an ecologist at the University
of Florida in Gainesville, tweeted a link to the paper: “Well
that’s horrifying. Is that suggested programming change by
a woman? Reject. Don’t know it’s a woman? Accept.” But
other researchers questioned whether
the study, posted at PeerJ PrePrints on
9 February, definitively demonstrates
gender bias.

Preprint at http://doi.org/bchz (2016)


Obama’s budget

US President Barack Obama’s
fiscal year 2017 budget plan,
released on 9 February, calls
for a 4% bump in research
and development funding.
But science advocates and
lawmakers are unhappy
with Obama’s decision to
boost science by relying on
‘mandatory’ spending rather
than on the ‘discretionary’
funding stipulated by
Congress, which decides how
much money each agency
will receive. Obama has
requested $8 billion for the
National Science Foundation,
up 6.7% over 2016’s estimate.
The request for the National
Institutes of Health is
$33.1 billion, up 2.6%, and
NASA’s budget request is
$19 billion, down $300 million.
See go.nature.com/hr9j6d
for more.


} RESEARCH 
Resistance rising 


Bacteria in humans and 
animals across Europe 
continue to show significant 
resistance to common 
antimicrobial treatments. The 
European Centre for Disease 


NUMBERCRUNCH | 


1,188 


The number of species of 
living sharks, skates, rays 
and chimaeras, according 
toa list collated at the 
University of Hamburg, 
Germany. The list includes 
509 species of shark, 

630 skate and ray species 
and 49 species of chimaera 
fish, a cartilaginous species. 
Source: S. Weigmann J. Fish Biol. 
http://doi.org/bej4; 2016. 


Gravitational waves ring out 


Delighted physicists at the Advanced 

Laser Interferometer Gravitational-Wave 
Observatory (LIGO) revealed on 11 February 
that they had heard the gravitational ‘ringing’ 
produced by the collision of two black holes 
about 400 million parsecs (1.3 billion light- 
years) from Earth. The black holes, about 


Prevention and Control and 
the European Food Safety
Authority said in a joint report
on 11 February that “high 

to extremely high” levels of 
resistance to multiple drugs 
were seen in some types of 
Salmonella in 2014. Resistance 
that would “significantly” 
reduce the effective treatments 
available was also seen in 
some Campylobacter types. 
Resistance to colistin — a last- 
resort antibiotic treatment — 
in Salmonella and Escherichia 
coli was found in chickens. 


Research standards 


A global network of science 
academies has called for 
coherent international 
research standards. The 
InterAcademy Partnership 
issued Doing Global Science 
(see go.nature.com/3zpmwsg), 
a 160-page guide to
responsible science practices, 
on 11 February. The report 
says that researchers have 

an obligation to share data, 
conduct peer review, disclose 
conflicts of interest, keep clear 
records, and discuss possible 
harmful consequences of work 
when they plan projects. It also 
says that institutions should 
provide clear guides and 
strong training for responsible 
research. 


Asian genomes 

An effort to sequence the 
genomes of 100,000 people 
from Asia was announced on 
11 February. The GenomeAsia 
100K project, hosted by 
Nanyang Technological 
University in Singapore, aims 
to characterize the genomes 
of individuals from an initial 


36 and 29 solar masses, merged into a single
gravitational sink in space-time weighing 

62 solar masses. The merger temporarily 
radiated more energy — in the form of 
gravitational waves — than all the stars in the 
observable Universe emitted as light in the same 
amount of time. See page 261 for more. 


list of 19 countries across the 
continent. The project will also 
gather medical information 
and microbiome samples. 

The project consortium 

says that the biology of the 
world’s Asian population 

is under-represented in 
scientific research, and 

hopes that its results will 
accelerate precision-medicine 
applications for Asian patients. 


Renewables rise 


Energy from renewable 
sources reached 16% in 2014 
in the European Union, up 
from 15% in 2013, according 
to 10 February figures from 
Eurostat, the EU’s statistical 
office. The EU as a whole has a 
target to get 20% of its energy 
from renewables by 2020, 


CHRIS MADDALONI/NATURE 
with individual countries
having different burdens. In 
2014, Finland had the biggest 
increase in renewables, 
increasing by 2 percentage 
points, whereas Bulgaria 
slipped by 1 percentage 
point — although both 
countries remain on track to 
make their targets by 2020. 


Italy GM reprimand 


The University of Naples in 
Italy formally reprimanded 
11 scientists, including 
laboratory chief Federico 
Infascelli, on 10 February, after 
an investigation committee 
concluded that three research 
papers co-authored by them 
contained manipulated 

data. The papers’ two 
corresponding authors are 
also forbidden from 
publishing for two years 
without approval from their 
department director. The 
papers claimed that animals 
were harmed by eating 
genetically modified feed. 
Police in the Veneto region 
used Infascelli’s work to justify 
destruction of a genetically 
modified crop last year; 

the farmer is now seeking 
compensation. 


Pachauri protests 
Rajendra Pachauri (pictured) 
was appointed executive 
vice-chairman of Delhi-based 
think tank The Energy and 


TREND WATCH 


An online poll answered by more
than 3,600 Nature readers suggests
that some 10% have waited at
least 3 years for one or more of
their papers to be published in a
journal (see Nature 530, 148-151;
2016). But more than one-third
had never waited longer than a
year. Readers were also asked what
they thought was the best way
to speed up publication. More
than 40% suggested that peer
reviewers should stop asking for
unnecessary revisions, and 22%
asked journal editors to make
quicker, clearer decisions.


Resources Institute (TERI) on 
8 February. Last year, Pachauri 
resigned as chair of the 
Intergovernmental Panel on 
Climate Change after claims 
that he had sexually harassed 
a now-former TERI employee.
He was also replaced as 

TERI director-general. On 

10 February, a second former 
employee accused Pachauri of 
sexual harassment. Alumni of 
TERI University have started 
a petition in protest against 
Pachauri’s appointment, 

and the scientist has taken a 
voluntary leave of absence. 


KI head resigns 


Anders Hamsten, the 
vice-chancellor of the 
Karolinska Institute (KI) 

in Stockholm, has resigned 
after acknowledging that he 
mishandled an investigation 
into surgeon Paolo 
Macchiarini. The KI will 
also reopen its misconduct 
inquiry into Macchiarini, it 
announced on 13 February. 
In an open letter in the
Swedish newspaper Dagens
Nyheter, Hamsten wrote

that he had “completely 
misjudged” Macchiarini, 
who has been accused of 
research misconduct and 
ethical breaches in his work 
implanting artificial windpipes 
into patients. Last August, 
Hamsten cleared Macchiarini 
after an external investigation 
found that the surgeon had 
committed misconduct in 
seven research papers. 

See go.nature.com/7bpza6 
for more. 


Gas leak plugged 


A huge gas leak at a storage 
facility of the Southern 
California Gas Company has 
been “temporarily controlled”, 
the firm says. A 13 February 
assessment by California's Air 
Resources Board suggests that 
94,000 tonnes of natural gas 
have escaped during the leak 
in Aliso Canyon, which was 
first detected on 23 October, 
and declared an emergency by 
the state governor in January. 
In a 15 February statement,
Southern California Gas 
Company said that it was 

now cementing the leak to 
permanently seal the well. 


LISA masses freed 


Less than a week after the 
first detection of gravitational 
waves was announced (see 
page 261), a European 


THE WAITING GAME
Around 10% of Nature’s readers say that their longest wait to get a
paper published in a journal has been 3 or more years.
Poll question: What is the longest time that you have waited for
a research paper to be published? (3,644 responses)
Answer categories: less than 6 months; 6 months-1 year; 1-2 years;
2-3 years; 3-5 years; more than 5 years.


SEVEN DAYS | THIS WEEK | 


21-26 FEBRUARY 
New Orleans in 
Louisiana plays host to 
the 2016 Ocean Sciences 
Meeting. 
osm.agu.org/2016 


22-28 FEBRUARY 
The 4th session of 
UNESCO’s Plenary of 
the Intergovernmental 
Science-Policy Platform 
on Biodiversity and 
Ecosystem Services 
(IPBES) convenes in 
Kuala Lumpur. 
go.nature.com/p43aaz 


Space Agency (ESA) probe 
has demonstrated a crucial 
technology for future space- 
based detectors of the waves. 
On 16 February ESA said 
that its LISA Pathfinder 
mission performed one of 
its most delicate stages — to 
suspend two small solid test 
cubes freely in a vacuum 
enclosure inside the probe. 
The spacecraft launched 

on 3 December and on 

23 February will begin its 
science mission by bouncing 
a laser beam between the two 
cubes in freefall. 


Philae’s long sleep 


The European Space Agency 
(ESA) has given up hope 

that its comet-probing
lander, Philae, will ever work 
again. On 12 February, the 
space agency announced that 
Philae, the Rosetta spacecraft’s 
lander, was entering “eternal 
hibernation”. Philae landed

on comet 67P/Churyumov- 
Gerasimenko in November 
2014 and operated for 

64 hours before going into 
hibernation. The craft last sent 
a signal to scientists in July 
last year. ESA scientists are no 
longer sending signals to the 
lander, and say the chances of 
hearing from it again are “close 
to zero”. 


NATURE.COM
For daily news updates see:
www.nature.com/news
A computer simulation of the pair of merging black holes that LIGO detected using gravitational waves.


GRAVITATIONAL WAVES 


LIGO’s path to victory 


Historic discovery of ripples in space-time meant ruling out the possibility of a fake signal. 


BY DAVIDE CASTELVECCHI 


At 11:53 a.m. local time on 14 September
2015, an automated e-mail appeared in
the inbox of Marco Drago, a physicist
at the Max Planck Institute for Gravitational 
Physics in Hannover, Germany. It contained 
links to two plots, each showing a wave shaped 
like a bird’s chirp that emerged suddenly from 
a noisy background and ended in a crash. 

It was a signal that Drago had been trained to 
spot and that the US-led Advanced Laser
Interferometer Gravitational-Wave Observatory
(LIGO) that he works on was built to detect:
the signature ripples in space-time produced 
when two black holes collide to form a single 
gravitational sink. No one had ever directly 
detected gravitational waves before, nor a 
black-hole merger. The plots, one from each of 


LIGO’s twin detectors in Washington state and 
Louisiana, would go on to make history. 

On 11 February, the LIGO collaboration 
announced that it had made the first detec- 
tion of gravitational waves from a black-hole 
merger that occurred about 400 million par- 
secs (1.3 billion light years) from Earth. It 
was just over 100 years after Albert Einstein 
predicted such waves as part of his general 
theory of relativity. “We did it!” David Reitze, 
the executive director of the LIGO Laboratory, 
said at a press conference in Washington DC. 
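
As a rough check on the distance quoted above, the conversion from parsecs to light years can be sketched in a few lines. The conversion factor (about 3.2616 light years per parsec) is a standard value that does not appear in the article, and the function name is purely illustrative.

```python
# Standard conversion factor; an assumption brought in from outside the article.
LIGHT_YEARS_PER_PARSEC = 3.2616


def parsecs_to_light_years(parsecs: float) -> float:
    """Convert a distance in parsecs to light years."""
    return parsecs * LIGHT_YEARS_PER_PARSEC


# The article quotes "about 400 million parsecs" for the merger.
distance_ly = parsecs_to_light_years(400e6)
print(f"{distance_ly:.2e} light years")
```

The result rounds to the 1.3 billion light years given in the text. Both figures are approximate, so agreement to two significant figures is all that should be expected.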

As well as being expected to lead to a Nobel
prize, LIGO’s discovery launches the field
of gravitational-wave astronomy, in which
scientists will ‘listen’ to the waves to learn more
about the Universe (see page 263).

On that September morning, Drago could 
not take it for granted that he was looking at 
the chirp of a black-hole merger. “It was clear
that it was something extraordinary,” he says.
But the plots were also something that the
LIGO researchers had expected to see injected
artificially by their colleagues to test the detectors.
“I went down to the office of my colleague
Andrew Lundgren to ask him if he was aware
of an injection,” says Drago.

Lundgren quickly checked the data logs and 
found no traces of a drill. Next, Drago sent an 
e-mail to the entire LIGO collaboration — 
1,000 researchers spread around the world 
— to see what they thought. 

“When I first saw it, I said ‘Oh, it’s an injection,
obviously’,” says physicist Bruce Allen,
Drago and Lundgren’s boss. Allen, who was 
IN FOCUS


LIGO’S GROWING UNIVERSE 


An upgrade to the Laser Interferometer Gravitational-Wave Observatory (LIGO) that 
was completed last year increased the volume of space that the detector could scan. 
This greatly improved its chances of detecting gravitational waves — a historic feat 
that LIGO announced on 11 February. Further planned improvements mean that 
LIGO's observable universe is set to increase again soon. 


ORIGINAL LIGO
2002-10
Range*: 20 million parsecs
(64 million light years)
Results: No gravitational
waves detected

ADVANCED LIGO (first run)
Sept 2015 to Jan 2016
Range: 60 million parsecs
(190 million light years)
Results: Detected first
gravitational waves

ADVANCED LIGO (future)
Range: 200 million parsecs
(640 million light years)

Map labels: Capricornus, Sculptor, Perseus-Pisces, Phoenix, Ophiuchus,
Shapley, Ursa Major and Leo superclusters.
Scale bar: 31 million parsecs (100 million light years)


*Radius assuming that source of gravitational waves is merger of two neutron stars; 
the radius changes depending on the source of the gravitational waves. 


in a meeting at the time, did not bother to 
enquire until after his lunch break. 

Within a few hours, collaborators on the 
other side of the Atlantic woke up to Drago’s
e-mail — including experimental physicist 
Rainer Weiss at the Massachusetts Institute 
of Technology (MIT) in Cambridge, who is 
credited as the chief inventor of LIGO. “When 
I started looking at these waveforms, they were 
something spectacular,” he says.

To many, the timing of the signal seemed too 
good to be true: the collaboration had com- 
pleted a five-year upgrade to its instruments 
(see ‘LIGO’s growing universe’). Moreover, 
the LIGO collaboration had also given a small 
number of its members the power to inject fake 
signals and to hide whether they were real or 
simulated in order to test the team’s responses. 
But even such a ‘blind injection’ ought to have
left some traces in the data, says LIGO spokes- 
woman Gabriela Gonzalez, a physicist at 
Louisiana State University in Baton Rouge. 

After a long day of calls and e-mails, she 
determined that no blind injection had 
occurred and told the entire collaboration. 


Only then did Kip Thorne, a theoretical 
physicist at the California Institute of Technol- 
ogy (Caltech) in Pasadena who co-founded 
LIGO with Weiss and Caltech colleague Ron- 
ald Drever, realize that a 40-year-old dream 
had come true. But it was not yet time to pop 
the champagne. The collaboration needed 
to do more before announcing a discovery to 
the world. “That night at home, I celebrated by 
just smiling to myself, 
because I could not 
tell my wife yet,” 

Thorne says. 

Gonzalez and 
her team decided to 
take data for another 
month before beginning a full analysis: the 
researchers needed to record the natural noise 
present in their detectors to have something 
to compare with the chirp. They concluded 
that the odds of noise producing that loud 
pattern — and the very same pattern in both 
Louisiana and Washington at about the same 
time — were so low that it should only occur 
by chance less than once every 203,000 years. 
To extract as much information as possible,
the researchers then performed lengthy supercomputer
simulations, Allen says. These confirmed
that the data beautifully matched the
predictions of Einstein’s 1915 general theory of
relativity, and the theoretical work that in
the past few decades has led physicists to understand
the theory’s implications in fine detail.

From the waveforms, the researchers were 
able to deduce that one black hole was about 
36 times the mass of the Sun, and the other 
was about 29 solar masses. As the two objects 
orbited each other, they warped the fabric of
space and time around them in a fluctuating
pattern. Those fluctuations then travelled 
across the Universe as gravitational waves for 
an estimated 1.3 billion years, stretching and 
squeezing space as they moved. 
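
The energy carried away by those waves can be estimated from the masses alone. The sketch below takes the 36 and 29 solar masses quoted here, together with the figure of about 62 solar masses reported for the merged black hole elsewhere in this issue's coverage, and applies E = mc² to the difference. The solar mass and speed of light are standard values that are not given in the article, and all three masses are approximate, so the result is an order-of-magnitude illustration only.

```python
# Standard physical constants; assumptions brought in from outside the article.
SOLAR_MASS_KG = 1.989e30
SPEED_OF_LIGHT_M_S = 2.998e8


def radiated_energy_joules(m1_solar: float, m2_solar: float, final_solar: float) -> float:
    """Energy equivalent (E = m c^2) of the mass lost in a merger."""
    mass_deficit_solar = m1_solar + m2_solar - final_solar
    return mass_deficit_solar * SOLAR_MASS_KG * SPEED_OF_LIGHT_M_S ** 2


# 36 + 29 - 62 = 3 solar masses converted into gravitational waves.
energy = radiated_energy_joules(36, 29, 62)
print(f"about {energy:.1e} joules")
```

On these numbers, roughly three Suns' worth of mass was radiated as gravitational waves.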

LIGO’s twin interferometers bounce laser 
beams between mirrors at the opposite ends 
of perpendicular, 4-kilometre-long vacuum 
pipes. A gravitational wave passing through 
will alter the length of the pipes in different 
ways, causing the laser beams to shift slightly 
out of sync. By the time the waves from the 
black-hole merger arrived on 14 September, 
they had become tiny ripples, changing the 
length of the pipes on the order of just 1 part 
in 1 billion trillion (10²¹).
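
To put that fraction in absolute terms, the following sketch multiplies the 4-kilometre pipe length by a fractional change of 1 part in 10²¹. The proton diameter used for comparison (about 1.7 × 10⁻¹⁵ metres) is an approximate outside value, not a figure from the article.

```python
ARM_LENGTH_M = 4_000.0        # 4-kilometre vacuum pipes, from the article
FRACTIONAL_CHANGE = 1e-21     # 1 part in 1 billion trillion, from the article
PROTON_DIAMETER_M = 1.7e-15   # approximate value; an assumption for comparison


def length_change_m(arm_length_m: float, fractional_change: float) -> float:
    """Absolute change in length for a given fractional change."""
    return arm_length_m * fractional_change


delta_l = length_change_m(ARM_LENGTH_M, FRACTIONAL_CHANGE)
fraction_of_proton = delta_l / PROTON_DIAMETER_M
print(f"length change: {delta_l:.1e} m")
print(f"fraction of a proton diameter: {fraction_of_proton:.4f}")
```

Under these assumptions the pipes change length by a few billionths of a billionth of a metre, a small fraction of the width of a proton.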

Although the two black holes had probably 
been orbiting each other for millions of years, 
LIGO began to pick up their waves only when 
they reached a frequency of 35 cycles per sec- 
ond (hertz). The frequency rapidly increased 
to 250 hertz. The signal became chaotic and 
then rapidly died down; the whole thing was 
over within a quarter of a second. Crucially, 
both detectors saw it at roughly the same 
time — Livingston, in Louisiana, first and 
Hanford, in Washington, 7 milliseconds later. 
The delay is an indication of how the waves 
swept through Earth. 
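
A short calculation illustrates what the delay can and cannot say. It assumes that the waves travel at the speed of light, as general relativity predicts, and that the two detectors are about 3,000 kilometres apart in a straight line; the separation is an outside figure that the article does not state. The function names are illustrative.

```python
import math

SPEED_OF_LIGHT_M_S = 2.998e8
# Assumed straight-line separation of the two detectors (not given in the article).
DETECTOR_SEPARATION_M = 3.002e6


def max_delay_seconds(separation_m: float) -> float:
    """Longest possible delay: a wave travelling along the line joining the sites."""
    return separation_m / SPEED_OF_LIGHT_M_S


def arrival_angle_degrees(delay_s: float, separation_m: float) -> float:
    """Angle between the wave's direction of travel and the line joining the sites."""
    cos_theta = SPEED_OF_LIGHT_M_S * delay_s / separation_m
    if abs(cos_theta) > 1.0:
        raise ValueError("delay exceeds the light-travel time between detectors")
    return math.degrees(math.acos(cos_theta))


max_delay = max_delay_seconds(DETECTOR_SEPARATION_M)
angle = arrival_angle_degrees(7e-3, DETECTOR_SEPARATION_M)
print(f"maximum possible delay: {max_delay * 1e3:.1f} ms")
print(f"angle to the detector baseline: {angle:.0f} degrees")
```

A 7-millisecond delay sits inside the maximum of about 10 milliseconds, as a real signal must. With only two detectors, the angle fixes a ring of possible directions on the sky rather than a single point, and the rounded delay makes the angle itself approximate.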

Then came writing the paper. This involved 
getting 1,000 researchers to agree on every 
detail, and took some 5,000 e-mails, says 
LIGO’s chief detector scientist Peter Fritschel 
at MIT. On 21 January, the team submitted 
the paper, which Physical Review Letters pub- 
lished on 11 February (B. P. Abbott et al. Phys. 
Rev. Lett. 116, 061102; 2016), the same day that 
LIGO held multiple press conferences around 
the world. 

“These amazing observations are the confir- 
mation of a lot of theoretical work, including 
Einstein's general theory of relativity, which pre- 
dicts gravitational waves,’ says physicist Stephen 
Hawking at the University of Cambridge, UK. 

LIGO’s triumph is a fitting end to the tale 
that Einstein began. He never believed that 
black holes existed. But although astronomers 
had accumulated compelling evidence for black 
holes by observing their surroundings, notes 
Thibault Damour, a theoretical physicist at the 
Institute of Advanced Scientific Studies near 
Paris, the LIGO signal is “the first real direct 
proof of their existence”.


MAP ADAPTED FROM ANDREW Z. COLVIN/CC-BY-SA 


ASTROPHYSICS 




Young scientists poised to ride 
the gravitational wave 


Detection of ripples in space-time kicks off new era in physics. 


BY ALEXANDRA WITZE 


The first direct detection of
gravitational waves has opened 
a new window in physics and 
astronomy — rewarding a cohort of 
young researchers who gambled on 
finding evidence of a phenomenon that 
had long eluded physicists. 

Conceived in the 1970s and built 
in the 1990s, the Laser Interferom- 
eter Gravitational-Wave Observatory 
(LIGO) has been promising results for 
decades. On 11 February, it finally deliv- 
ered, when project scientists reported 
finding signals of the space-time ripples 
known as gravitational waves. 

Its observations are now poised 
to reshape ideas about high-gravity 
environments — such as colliding black 
holes, exploding stars and the earliest 
moments of the Universe. 

“The big game-changer for me is, we 
really do have data and we can finally 
test our theories,” says Samaya Nissanke, 
an astrophysicist at Radboud University in 
Nijmegen, the Netherlands. 

Nissanke is among the legions of early- 
career researchers who got into gravitational 
physics hoping to use data from LIGO and 
similar detectors. “You have this unique probe 
into extreme gravity and extreme space-time 
in a way that really holds your imagination,”
she says. “I was hooked.”

LIGO’s first phase ran for years without 
detecting any gravitational waves, but after a 
major upgrade in September last year, it took 
just days to find a signal. That strengthens the 
belief that it will catch many future waves. 
Physicists expect each additional discovery to 
bring fresh insight. 
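
Why a longer reach matters so much can be shown with simple geometry: the volume of space surveyed grows with the cube of the detector's range. The sketch below uses the ranges listed in the ‘LIGO’s growing universe’ graphic (20, 60 and 200 million parsecs) and assumes, as a simplification, that sources are spread evenly through space.

```python
def volume_gain(old_range_mpc: float, new_range_mpc: float) -> float:
    """Factor by which the surveyed volume grows when the range increases.

    Volume scales as the cube of the radius, so the constant 4/3 * pi cancels.
    """
    return (new_range_mpc / old_range_mpc) ** 3


first_upgrade = volume_gain(20, 60)     # original LIGO -> Advanced LIGO first run
future_upgrade = volume_gain(60, 200)   # first run -> planned future range
print(f"first upgrade: {first_upgrade:.0f} times the volume")
print(f"future upgrade: about {future_upgrade:.0f} times the volume again")
```

Tripling the range gives 27 times the volume, which helps to explain how a detector that saw nothing for years could find a signal within days of being upgraded. The ranges in the graphic assume neutron-star mergers, so the factors are indicative rather than exact for other sources.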

“The fact that this happened right away has 
really given us a boost,” says Laura Cadonati, 
a physicist at the Georgia Institute of Tech- 
nology in Atlanta who oversees LIGO data 
analysis. But the field did not look nearly so 
rosy 15 years ago, when Vicky Kalogera, an 
astrophysicist at Northwestern University in 
Evanston, Illinois, was calculating how often 
astrophysical objects such as black holes or 
neutron stars — the ultra-dense leftovers of 
exploded stars — merge. Such collisions are 
thought to be the source of most of the gravita- 
tional waves that LIGO was designed to detect. 


Astrophysicist Mansi Kasliwal hopes to use gravitational-wave 
signals to study colliding neutron stars. 


Kalogera led some of the early calculations 
to explore how often two neutron stars might 
collide close enough to Earth for LIGO to spot 
the ensuing gravitational waves (V. Kalogera 
et al. Astrophys. J. 601, L179-L182; 2004). 
Estimates from different groups varied widely, 
and included some pessimistic scenarios in 
which LIGO had little chance of ever catching 
any waves. Kalogera faced a tough decision: 
whether to stick with gravitational-wave astro- 
physics or switch to topics that might be more 
likely to yield actual data. 

“I went with my guts when everybody 
told me it was the wrong career choice,” she
says. “Now, it is stunning to actually be in the 
detection era.” 

Surprisingly, LIGO’s first detection did not 
come from a binary neutron-star system — 
which are thought to be relatively common, 
with six known pairs in our Galaxy alone — 
but from two large black holes. Both were of 
the order of 30 times the mass of the Sun. “You 
can start to think of these not just as
gravitational-wave sources,” says Nissanke. “They are
real astronomical beasts.” 

Still, many physicists hold out hope that 
LIGO and similar detectors will soon catch 
gravitational waves from merging neutron stars. 
These incredibly dense stars are impenetrable to 
ordinary astronomical telescopes, which cannot 


probe beneath their blazingly bright 
surfaces; researchers must rely on mod- 
els to extrapolate what is going on inside. 

Gravitational waves could change 
that, yielding information such as the 
precise sizes of neutron stars and how 
neutrons pack themselves together 
so tightly. These answers would come 
from the details of how the neutron 
stars spiral towards one another in the 
last moments before their final merger. 

“There’s this potential to learn about 
the densest stable matter in the Uni- 
verse, in a way that we've been blind to 
before,” says Jocelyn Read, a physicist at 
California State University, Fullerton. 

When neutron stars merge, they are 
thought to fuse light chemical elements 
into heavier ones, which they then spew 
into the surrounding environment. 
Such cosmic collisions are the source 
of many of the heavy metals in the cos- 
mos, including much of the gold that has 
ever been created, says Mansi Kasliwal, 
an astrophysicist at the California Institute of 
Technology in Pasadena. 

“We haven't actually seen explosions that are 
powerful enough to synthesize these elements,” 
she says. But when LIGO detects gravitational 
waves, astronomers will be able to command 
their telescopes to sweep the part of sky where 
the waves come from — and, with any luck, will 
capture a flash of these gold mines in the sky. 

Kasliwal is already searching with a wide- 
field camera on a 1.2-metre telescope at Palo- 
mar Observatory in California. Next year, this 
effort will upgrade to a much bigger camera 
that can survey the sky 12 times faster. A simi- 
lar survey in Chile is also expected to come 
online next year, giving both the Northern 
and Southern hemispheres a dedicated tel- 
escope for following the tantalizing traces of 
gravitational-wave detections. 

Astronomers hope eventually to piece 
together a more complete picture of how grav- 
itational waves and conventional astronomy 
come together. It’s like seeing a film in which 
the combination of images (electromagnetic 
waves) and sound (gravitational waves) pro- 
vides a much fuller picture than either could 
alone, says Alessandra Corsi, an astrophysicist 
at Texas Tech University in Lubbock. “It feels 
incredibly exciting to be right at the start of a 
new era.”
Mice are sensitive to minor changes in food, bedding and light exposure.


A mouse’s house 
may ruin studies 


Environmental factors lie behind many irreproducible rodent experiments.
BY SARA REARDON 


It’s no secret that therapies that look
promising in mice rarely work in people.
But, too often, experimental treatments
that succeed in one mouse population do not
even work in other mice, suggesting that many
rodent studies are flawed from the start.

“We say mice are simpler, but I think the
problem is deeper than that,” says Caroline
Zeiss, a veterinary neuropathologist at Yale University
in New Haven, Connecticut. Researchers
rarely report on subtle environmental factors
such as their rodents’ food, bedding or exposure
to light; as a result, conditions vary widely across
labs despite research showing that these factors
can significantly affect the animals’ biology.

“It’s sort of surprising how many people are
surprised by the extent of the variation” between
mice that receive different care, says Cory
Brayton, a pathologist at Johns Hopkins University
in Baltimore, Maryland. At a meeting on
mouse models at the Wellcome Genome Campus
in Hinxton, UK, on 9-11 February, she and
others explored the many biological factors that
prevent mouse studies from being reproduced.

Christopher Colwell, a neuroscientist at the
University of California, Los Angeles, has first-hand
experience with these issues. He and a colleague
studied autism in the same genetically
modified mouse line, but got different results 
on the same behavioural tests. Eventually, they 
worked out why: Colwell, who studies circadian 
rhythms, keeps his mice dark in the daytime 
to trick their body clocks into thinking day is 
night, so that the nocturnal animals are alert 
when tested in the day. His colleague does not. 
Nutrition can also determine whether a 
mouse study succeeds or fails. Some mouse 
foods contain oestrogens and endocrine- 
disrupting chemicals that can affect research 
on cancer, among other diseases (M. Nygaard 
Jensen and M. Ritskes-Hoitinga Lab. Anim. 41, 
1-18; 2007). And the high-fat, high-sugar food 
used in obesity studies goes rancid quickly; 
when it does, mice may stop eating and lose 
weight without researchers realizing why. 
Food choices can also alter a mouse’s gut 
microbiome. Catherine Hagan Gillespie, a vet- 
erinary pathologist at the Jackson Laboratory 
in Sacramento, California, has found that spe- 
cies of bacteria in the gut vary widely between 
mice from different vendors (A. C. Ericsson 
et al. PLoS ONE 10, e0116704; 2015). In an 
unpublished study, she also found that mice
with different gut bacteria showed different 
anxiety levels in behavioural tests. 

But few behavioural scientists think about 
microbiology assessments, says Hagan Gillespie. 
Even when they do, the extra work can increase 
the complexity and cost of the study. Yet the 
mouse microbiome is sensitive to many factors, 
such as air quality and maternal stress. 

Differences in the gut microbiome may 
explain why mice with the same genes can have 
different characteristics, or phenotypes, says 
George Weinstock, associate director for micro- 
bial genomics at the Jackson Lab’s site in Farm- 
ington, Connecticut. The lab, which breeds 
and supplies mice for use in studies around the 
world, tightly controls factors such as the type 
and quantity of food and the pH of water that 
animals receive. Even so, it finds differences 
between mice at its three sites. Weinstock says 
that the company has begun looking into stand- 
ardizing its customers’ experiments by provid- 
ing food and care instructions for its mice. 

But even when improved mice and food are 
available, some researchers resist using them in 
case it affects their results, says Graham Tobin, 
former technical director of mouse-diet vendor 
Teklad in Alconbury, UK. He argues that stand-
ardizing results is worth any inconvenience, 
and notes that researchers rarely resist adopt- 
ing other new technologies that can throw older 
data into question. 

Zeiss says that the competitive nature of sci- 
ence might increase researchers’ resistance to 
changing how they consider animals in research 
design. If scientists have to treat their animals at 
the right point in the experiment, analyse both 
clinical and biomarker changes, include old 
mice and both sexes to ensure that results are 
representative of broad populations, and control 
environmental variables, each experiment will 
take much longer and the scientists are probably 
not going to be able to publish as much, she says. 

The US National Institutes of Health (NIH) 
has taken steps to address some of these prob- 
lems, although some people say it is not enough. 
Some NIH institutes require certain animal tri- 
als to be replicated before a therapy can move 
into clinical trials, but the agency says that it 
has no plans to require this agency-wide. And 
in 2014, the NIH began requiring researchers 
to include female animals in studies, and giving 
out supplementary grants to those who com- 
plained about the cost. But the agency has not 
issued any specific grants or supplements to 
study other confounding factors. 

That is disappointing to those who would 
like to see researchers control — or at least 
report — factors such as the strain of mice 
used and the type of environment they are 
raised in. This would allow meta-analyses 
of published literature that could identify 
any confounding factors. “The information 
and the wisdom is out there,” says Zeiss, “but
studies get funded without necessarily a lot of 
attention to that.” SEE EDITORIAL P.254
‘Hug a preprint, biologists!’


ASAPbio meeting discusses the ins and outs of posting work online before peer review. 


BY EWEN CALLAWAY AND KENDALL POWELL 


Physicists do it; computer scientists,
mathematicians and economists do it.

And this week, a who's who of biomedi- 
cal researchers and publishers is asking what 
it will take to convince life scientists to do it, 
too — release their work online before peer 
review and formal journal publication. 

The impetus for the gathering, called 
ASAPbio (asapbio.org), is the growing 
frustration of some researchers at the slow pace 
of publishing in biology journals (see Nature 
530, 148-151; 2016). The delay can take years, 
notes Ron Vale, a cell biologist at the University 
of California, San Francisco. That can seriously 
affect scientists’ careers because they don't 
receive recognition for their work until it is 
published. 

The solution, argues Vale, a co-organizer 
of ASAPbio, is for biologists to embrace pre- 
prints: pre-publication manuscripts posted 
online. These speed up dissemination, give stu- 
dents and postdocs tangible ways to cite their 
contributions to the literature, and stimulate 
discussion and ideas, he says — accelerating 
and improving life-sciences research. 

There are signs that some biologists are 
ready to follow the lead of their colleagues 
in the physical sciences, where it is now 
routine for research to be submitted to the 
arXiv preprint server — founded 25 years 
ago — before publication. A life-sciences- 
only preprint server called bioRxiv started 
in 2013 and is rapidly growing in popularity 
(see ‘The growth of bioRxiv’), especially in
data-intensive fields such as computational biology
and genomics. 

It has now seen more than 3,100 posted 
preprints, says John Inglis, the site’s co-founder 
and the executive editor of Cold Spring Harbor 
Laboratory Press in New York. Other journals, 
such as the online F1000Research, also encour- 
age the posting of life-sciences manuscripts 
before peer review. 
But preprints are still unfamiliar ground
for biologists, Vale says. Leslie Vosshall, a 
neurobiologist at the Rockefeller University 
in New York City, says that if such sites are to 
become popular in the life sciences, researchers 
will have to overcome common concerns — for 
example, that preprints could lead to scientists 
being scooped by competitors and missing out 
on credit for ideas. Vale and Vosshall say that 
such worries are misplaced. “I think most biologists
don’t know about preprints, or if they do,
they’ve heard of them at a very superficial level,
to the point that they don’t really understand
them very well,” Vale says.


THE GROWTH OF BIORXIV 


More than 3,100 preprints have been posted 
to the biology preprint server bioRxiv since its 
launch in 2013.

“There’s no doubt that preprints are
happening,” says Harold Varmus, a cancer
biologist at Weill Cornell Medical College
in New York City and another co-organizer
of ASAPbio, held on 16-17 February at the
Howard Hughes Medical Institute in Chevy
Chase, Maryland. “But I don't think we've ever
had a conversation among all the constituents
about what the effects will be.”

Both Vale and Vosshall think that pre-
prints will become widely accepted only if the
life-sciences community develops a consensus
that preprint publication establishes a priority 
for any discovery. A discussion about that 
is at the top of ASAPbio’s agenda, and Vale 
co-authored an article on it, posted to the 
conference’s website last week. He has also 
tasked meeting attendees with considering 
how funding agencies and academic com- 
mittees should view preprints when deciding 
whom to fund and hire. 

Another concern is that quality might dip 
if life scientists flood preprint servers with 
non-peer-reviewed work. But supporters 
of preprint publication say that, if anything, 
researchers are more careful when their repu- 
tation rides on early work made public for all 
to critique. 

The issue of whether a preprint could 
jeopardize the chances of a manuscript 
subsequently appearing in a peer-reviewed 
journal is also being resolved, says Inglis. Since 
bioRxiv launched, several journal publishers 
have changed their policies to expressly allow 
the publication of research previously posted 
to preprint servers. 

Some scientists would like to see more- 
radical changes. Many make new data sets 
and hypotheses instantly and freely available 
online at repositories such as GitHub, figshare 
and Zenodo, and hope for crowdsourced 
peer review of their work. “That’s my utopian 
fantasy. It would be amazing to live in a 
world with all radically free data,” says 
Jessica Polka, a postdoctoral fellow at Harvard 
Medical School in Boston, Massachusetts, 
and co-organizer of the ASAPbio meeting. 
She says that preprints are “the most practical 
of all the transformative things that could be 
implemented”. 

All of Vosshall’s preprint articles have also 
been published in conventional journals 
“through the excruciatingly slow process of
peer review”, she notes. “Most of them don't
look any different. Which begs the question,
why do we need journals any more?”

UN sets out
emissions plan
for planes


Standards aim to reduce CO₂
produced by new aircraft.


BY JEFF TOLLEFSON 


A United Nations panel has proposed the
first global greenhouse-gas emissions
standard for aircraft.

The draft rule, released by the International 
Civil Aviation Organization (ICAO) on 8 Feb- 
ruary, would apply to most new commercial 
and business aircraft, including designs already 
in production. But it would require only mini- 
mal changes to aviation technology over the 
next 12 years, and many environmentalists say 
that the proposal is inadequate. 

The ICAO standard would take full effect in 
2028; the panel is expected to adopt the plan 
later this year. Doing so could cut the amount 
of fuel used at cruising speed by an average of 
4% compared with current levels, according to 
the International Council on Clean Transpor- 
tation (ICCT), a non-profit research group in 
Washington DC. 

But many environmental groups found the 
proposal wanting. “We think that this is just
woefully insufficient,” says Vera Pardee, a lawyer
with the Center for Biological Diversity in Oak- 
land, California. She notes that the plan would 
not apply to aircraft that are already flying. 

Daniel Rutherford, programme director for 
marine and aviation at the ICCT in San Fran- 
cisco, California, agrees that the ICAO could 
have been more aggressive. An ICCT study 
released in December found that manufac- 
turers could reduce fuel consumption in new 
planes by 25% in 2024 and by 40% in 2034 by 
improving engine technologies and aerody- 
namics, and reducing aircraft weight. 

Nonetheless, Rutherford says, the ICAO’s 
plan is a step forward. “These standards do 
tend to matter over time as you update them 
and make them more stringent,” he adds. 

The ICAO process aims to plug a gap in 
the UN climate agreement signed in Paris 
last December. The panel also is working on 
a market-based offset mechanism that would 
levy a fee on international flights. 

In the meantime, individual countries might 
implement more-stringent standards for aircraft 
emissions, and environmentalists are gearing up 
for a fight. For instance, lawsuits from environ- 
mental groups helped to push the US Environ- 
mental Protection Agency to begin developing 
its own greenhouse-gas standards for aircraft.

Pollution in New Delhi fell by 10% when vehicle numbers on its roads were temporarily reduced.


ATMOSPHERIC SCIENCE 


Car ban yields 
science bounty 


Scramble by researchers to monitor driving restrictions in
Indian capital pays off.


BY MEERA SUBRAMANIAN 


New Delhi may be the world’s most
polluted city, but it’s making an effort
to relinquish that title. With pollution
from particulate matter at potentially
lethal levels early last December, city officials
took a drastic step: they announced that they
would temporarily restrict the use of private
vehicles by allowing owners to drive only on
alternate days, based on the sequence of their
number plates.

The initial results of that 15-day trial, 
which began on 1 January, are now in. 
Although traffic actually increased in the first 
week of the ban, the levels of PM2.5 — parti-
culate matter measuring less than 2.5 micro-
metres across — fell by roughly 10%. That
is a victory not just for New Delhi offi- 
cials, but also for the scientists who sprang 
into action to collect the data necessary to 
determine whether the test had achieved 
its goal. 

“This experiment with ‘live research’ has
been really quite exciting,” says Santosh
Harish, assistant director of the India centre 
of the Energy Policy Institute at the Univer- 
sity of Chicago (EPIC-India). EPIC-India 
and the New Delhi-based Council on Energy, 
Environment and Water (CEEW), an inde- 
pendent think tank, used video monitors 
around the city to document the types and 
numbers of vehicles on the roads. The groups 
had less than a month to collect baseline data 
before the driving restrictions began. 

But they weren't the only researchers 
interested in Delhi’s living lab. Econo- 
mist Gabriel Kreindler of the Massachu- 
setts Institute of Technology in Cambridge 
scrambled to secure human-study approval 
and funding for a survey of driver behav- 
iour during the traffic restrictions. Within 
18 days of the announcement of the driv- 
ing ban, he had arrived in New Delhi to 
oversee a surveying team from the Abdul 
Latif Jameel Poverty Action Lab’s office 
there. Kreindler’s work eventually found 
that the alternate-day restrictions were
well received by most drivers, who, in spite
of the disruption, were willing to comply 
and alter their behaviour for short periods 
of time. 

Other researchers built on work already 
under way. The Centre for Science and 
Environment (CSE), a non-profit research 
and advocacy group in New Delhi, had been 
closely analysing government air-quality data 
since last October. By December, govern- 
ment monitors were recording daily levels of 
noxious PM2.5 in the range of 400-600 micro-
grams per cubic metre. This is much higher
than the Indian legal standard of 60 micro- 
grams (which itself is more than double the 
25-microgram target threshold set by the 
World Health Organization). 

PM2.5 particles cause more than 600,000
premature deaths in India each year, from 
lung cancer, asthma, and cardiovascular and 
respiratory diseases. There is no known safe 
level for this pernicious pollutant. 

The CSE’s analysis found that, despite 
unfavourable weather conditions, the peak 
pollution during the driving scheme was 
lower than it would have been without 
the restrictions in place. 

“The region is geographically disadvan- 
taged,” says M. P. George, a scientist with the 
government's Delhi Pollution Control Com-
mittee. In winter, particulate levels can be
twice as high as during the summer, because
‘inversion layers’ of warm air trap cold air close
to the ground. This prevents pollution from dis-
sipating into the atmosphere. Emissions from
vehicles and construction dust also combine
with raised levels of black carbon generated
from winter sources — fires for warmth, brick
kilns that are lit in the autumn, and widespread
field burning in neighbouring states.

“It’s a very simple math,” says Sarath 
Guttikunda, director of the independent 
research group Urban Emissions, which 
is registered in New Delhi. “In winter, 
your air volume is going down and your 
emissions are going up.” 

Because atmospheric conditions such as 
wind and temperature can greatly affect partic- 
ulate-matter measurements, researchers from 
EPIC-India and the Evidence for Policy Design 
initiative at Harvard University in Cambridge, 
Massachusetts, gathered data from air-quality 
monitors in New Delhi and placed monitors in 
three adjacent cities as a control. They found 
that the daily level of PM2.5 pollution in Delhi
dropped by 10-13% during the vehicle restric- 
tions. Hourly comparisons showed an even 
greater improvement, at times an 18% fall. 
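
The logic of using neighbouring cities as a control can be sketched as a simple difference-in-differences calculation: the change seen in the control cities, which share the weather but not the driving rules, is subtracted from the change seen in Delhi. The sketch below is only an illustration of that idea and is not the researchers' actual method; the function names and all of the readings are hypothetical.

```python
from statistics import mean


def percent_change(before, during):
    """Percentage change in the mean level from the baseline period to the trial period."""
    return 100.0 * (mean(during) - mean(before)) / mean(before)


def diff_in_diff(treated_before, treated_during, control_before, control_during):
    """Change in the treated city minus the change in the control cities (percentage points)."""
    return (percent_change(treated_before, treated_during)
            - percent_change(control_before, control_during))


# Hypothetical daily PM2.5 readings in micrograms per cubic metre; not real data.
delhi_before = [400.0, 420.0, 380.0]
delhi_during = [380.0, 400.0, 360.0]
control_before = [300.0, 310.0, 290.0]
control_during = [320.0, 330.0, 310.0]

effect = diff_in_diff(delhi_before, delhi_during, control_before, control_during)
print(f"Estimated effect of the restrictions: {effect:.1f} percentage points")
```

In these made-up numbers, pollution in the control cities rises because of the weather while Delhi's falls slightly, so the estimated effect of the restrictions is larger than the raw drop in Delhi alone.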


The question now is whether New Delhi, the
capital of a nation with dozens of growing cities
choked by pollution, can build on the experi-
ment for long-term gains in air quality. “Delhi
has to get it right,” says Namit Arora, a member
of the pollution task force of the Delhi Dialogue 
Commission, a government initiative. This will 
require long-term strategies and coordination 
between local, regional and national efforts, he 
says, as well as a reduction in all sources of air 
pollution. Other researchers stress the need for 
more open-access data from a wide range of 
well-calibrated instruments. 

But the driving-restriction experiment has 
given researchers a tantalizing glimpse of one 
possible future. “We need to re-imagine the 
way we think about cities,” says Hem Himanshu 
Dholakia, a research associate at the CEEW. 
“That's the real opportunity.”


CORRECTIONS 

The Editorial ‘Blue future’ (Nature 529, 
255-256; 2016) should have said that 
2.4-4.6% of the world’s carbon emissions 
are captured and sequestered by living 
organisms in the oceans. And the power of 
ASTRO-H probe is 3,500 watts not 2,500 
as stated in the News story ‘High stakes 
for Japan’s space probe’ (Nature 530,

WHAT SPARKED THE CAMBRIAN EXPLOSION?


An evolutionary burst 540 million years ago filled the seas with an
astonishing diversity of animals. The trigger for that revolution is
finally coming into focus.


BY DOUGLAS FOX


A series of dark, craggy pinnacles rises
80 metres above the grassy plains
of Namibia. The peaks call to mind
something ancient — the burial mounds of
past civilizations or the tips of vast pyramids
buried by the ages.

The stone formations are indeed monu-
ments of a faded empire, but not from anything
hewn by human hands. They are pinnacle reefs,
built by cyanobacteria on the shallow sea floor
543 million years ago, during what is known
as the Ediacaran period. The ancient world
occupied by these reefs was truly alien. The
oceans held so little oxygen that modern fish
would have quickly foundered and died there. 
A gooey mat of microbes covered the sea floor, 
and on that blanket lived a variety of enigmatic 
animals whose bodies resembled thin, quilted 
pillows. Most were stationary, but a few mean- 
dered blindly over the slime, grazing on the 
microbes. Animal life was simple, and there 
were no predators. But an evolutionary storm 
would soon upend this quiet world. 

Within several million years, this simple eco-
system would disappear, and give way to a world
ruled by highly mobile animals that sported
modern anatomical features. The Cambrian
explosion, as it is called, produced arthropods 
with legs and compound eyes, worms with 
feathery gills and swift predators that could 
crush prey in tooth-rimmed jaws. Biologists 
have argued for decades over what ignited this 
evolutionary burst. Some think that a steep rise 
in oxygen levels sparked the change, whereas 
others say that it sprang from the development 
of some key evolutionary innovation, such as 
vision. The precise cause has remained elusive, 
in part because so little is known about the 
physical and chemical environment at that time. 


JOHN SIBBICK/NATURAL HISTORY MUSEUM 


But over the past 
several years, discov- 
eries have begun to 
yield some tantalizing 
clues about the end of 
the Ediacaran. Evi- 
dence gathered from the Namibian reefs and 
other sites suggests that earlier theories were 
too simplistic — that the Cambrian explosion 
actually emerged out of a complex interplay 
between small environmental changes that trig- 
gered major evolutionary developments. 

Some scientists now think that a small, 
perhaps temporary, increase in oxygen sud- 
denly crossed an ecological threshold, enabling 
the emergence of predators. The rise of carn- 
ivory would have set off an evolutionary arms 
race that led to the burst of complex body types 
and behaviours that fill the oceans today. “This 
is the most significant event in Earth evolution,” 
says Guy Narbonne, a palaeobiologist at Queen's 
University in Kingston, Canada. “The advent of 
pervasive carnivory, made possible by oxygena-
tion, is likely to have been a major trigger.”


The Cambrian seas 
teemed with new types 
of animal, such as the 
predator Anomalocaris 
(centre). 


ENERGY TO BURN 

In the modern world, it’s easy to forget that 
complex animals are relative newcomers to 
Earth. After life first emerged more than 3 bil- 
lion years ago, single-celled organisms domi- 
nated the planet for most of its history. Thriving 
in environments that lacked oxygen, they relied 
on compounds such as carbon dioxide, sulfur- 
containing molecules or iron minerals that act 
as oxidizing agents to break down food. Much 
of Earth’s microbial biosphere still survives on 
these anaerobic pathways. 

Animals, however, depend on oxygen — a 
much richer way to make a living. The process 
of metabolizing food in the presence of oxygen 
releases much more energy than most anaero- 
bic pathways. Animals rely on this potent, con- 
trolled combustion to drive such energy-hungry 
innovations as muscles, nervous systems and 
the tools of defence and carnivory — mineral- 
ized shells, exoskeletons and teeth. 

Given the importance of oxygen for animals, 
researchers suspected that a sudden increase in 
the amount of the gas in the oceans, to near- 
modern levels, could have spurred the Cam- 
brian explosion. To test that idea, they have 
studied ancient ocean sediments laid down 
during the Ediacaran and Cambrian periods, 
which together ran from about 635 million to 
485 million years ago. 

In Namibia, China and other spots around 
the world, researchers have collected rocks 
that were once ancient sea beds, and analysed 
the amounts of iron, molybdenum and other 
metals in them. The metals’ solubility depends 
strongly on the amount of oxygen present, so 
the quantity and type of those metals in ancient 
sedimentary rocks reflect how much oxygen 
was in the water when the sediments formed.

These proxies seemed to indicate that oxy-
gen concentrations in the oceans rose in several
steps, approaching today’s sea-surface concen-
trations at the start of the Cambrian, around 541
million years ago — just before more-modern 
animals suddenly appeared and diversified. This 
supported the idea of oxygen as a key trigger for 
the evolutionary explosion. 

But last year, a major study¹ of ancient
sea-floor sediments challenged that view. 
Erik Sperling, a palaeontologist at Stanford 
University in California, compiled a database 
of 4,700 iron measurements taken from rocks 
around the world, spanning the Ediacaran and 
Cambrian periods. He and his colleagues did 
not find a statistically significant increase in the 
proportion of oxic to anoxic water at the bound- 
ary between the Ediacaran and the Cambrian. 

“Any oxygenation event must have been far, 
far smaller than what people normally consid-
ered,” concludes Sperling. Most people assume
“that the oxygenation event essentially raised
oxygen to essentially modern-day levels. And
that probably wasn't the case”, he says.

The latest results come at a time when sci- 
entists are already reconsidering what was 
happening to ocean oxygen levels during this 
crucial period. Donald Canfield, a geobiol-
ogist at the University of Southern Denmark
in Odense, doubts that oxygen was a limiting
factor for early animals. In a study published last
month², he and his colleagues suggest that oxy-
gen levels were already high enough to support
simple animals, such as sponges, hundreds of 
millions of years before they actually appeared. 
Cambrian animals would have needed more 
oxygen than early sponges, concedes Canfield. 
“But you don't need an increase in oxygen
across the Ediacaran-Cambrian boundary,” he
says; oxygen could already have been abundant
enough “for a long, long time before”.

“The role of oxygen in the origins of animals
has been heavily debated,” says Timothy Lyons,
a geobiologist at the University of California,
Riverside. “In fact, it’s never been more debated
than it is now.” Lyons sees a role for oxygen in
evolutionary changes, but his own work³ with
molybdenum and other trace metals sug-
gests that the increases in oxygen just before
the Cambrian were mostly temporary peaks 
that lasted a few million years and gradually 
stepped up (see ‘When life sped up’). 


MODERN MIRRORS 

Sperling has looked for insight into Ediacaran 
oceans by studying oxygen-depleted regions 
in modern seas worldwide. He suggests that 
biologists have conventionally taken the wrong 
approach to thinking about how oxygen shaped 
animal evolution. By pooling previously pub-
lished data with some of his own and analysing
them, he found that tiny worms survive in areas
of the sea floor where oxygen levels are very low 
— less than 0.5% of average global sea-surface 
concentrations. Food webs in these oxygen- 
poor environments are simple, and animals feed 
directly on microbes. In places where sea-floor 
oxygen levels are a bit higher — about 0.5-3% 
of concentrations at the sea surface — animals 
are more abundant but their food webs remain 
limited: the animals still feed on microbes rather 
than on each other. But somewhere
between 3% and 10%, predators emerge and
start to consume other animals⁴.

The implications of this finding for evolution 
are profound, Sperling says. The modest oxygen 
rise that he thinks may have occurred just before 
the Cambrian would have been enough to trig- 
ger a big change. “If oxygen levels were 3% and 
they rose past that 10% threshold, that would 
have had a huge influence on early animal evo- 
lution,” he says. “There's just so much in animal 
ecology, lifestyle and body size that seems to 
change so dramatically through those levels.” 

The gradual emergence of predators, driven 
by a small rise in oxygen, would have meant 
trouble for Ediacaran animals that lacked obvi- 
ous defences. “You're looking at soft-bodied, 
mostly immobile forms that probably lived 
their lives by absorbing nutrients through their 
skin,” says Narbonne. 

Studies of the ancient Namibian reefs suggest 
that animals were indeed starting to fall prey to 
predators by the end of the Ediacaran. When 
palaeobiologist Rachel Wood from the Uni- 
versity of Edinburgh, UK, examined the rock 
formations, she found spots where a primitive 
animal called Cloudina had taken over parts of 
the microbial reef. Rather than spreading out 
over the ocean floor, these cone-shaped crea- 
tures lived in crowded colonies, which hid their 
vulnerable body parts from predators — an eco-
logical dynamic that occurs in modern reefs⁵.

Cloudina were among the earliest animals 
known to have grown hard, mineralized exo- 
skeletons. But they were not alone. Two other 
types of animal in those reefs also had min- 
eralized parts, which suggests that multiple, 
unrelated groups evolved skeletal shells around 
the same time. “Skeletons are quite costly to
produce,” says Wood. “It’s very difficult to come
up with a reason other than defence for why an
animal should bother to create a skeleton for
itself.” Wood thinks that the skeletons provided
protection against newly evolved predators. 
Some Cloudina fossils from that period even 
have holes in their sides, which scientists inter-
pret as the marks of attackers that bored into the
creatures’ shells⁶.

Palaeontologists have found other hints that 
animals had begun to eat each other by the late 
Ediacaran. In Namibia, Australia and New- 
foundland in Canada, some sea-floor sediments 
have preserved an unusual type of tunnel made 
by an unknown, wormlike creature⁷. Called
Treptichnus burrows, these warrens branch
again and again, as if a predator just below the
microbial mat had systematically probed for
prey animals on top. The Treptichnus burrows
resemble those of modern priapulid, or ‘penis’,
worms — voracious predators that hunt in a
remarkably similar way on modern sea floors⁸.


When life sped up

Big animals emerged during the Ediacaran period, but these creatures were slow or immobile. A rise in oceanic
oxygen concentrations at the end of the period might have helped to trigger the Cambrian evolutionary explosion.

800 million years ago (Myr): Oxygen levels rose from less than 0.1% to perhaps 1-2%.

… epoch might have ended with a temporary spike in oxygen. Other spikes could have happened at different times.

The Ediacaran animals were relatively simple and lacked evidence of legs, eyes and many other anatomical innovations.

The Cambrian explosion produced many of the animal types common today, such as arthropods (Marrella and
Anomalocaris) and chordates (Pikaia), a group that now includes vertebrates.

The rise of predation at this time put large, 
sedentary Ediacaran animals at a big disadvan- 
tage. “Sitting around doing nothing becomes a 
liability,” says Narbonne. 


THE WORLD IN 3D 

The moment of transition from the Ediacaran 
to the Cambrian world is recorded in a series 
of stone outcrops rounded by ancient glaciers 
on the south edge of Newfoundland. Below that 
boundary are impressions left by quilted Edi- 
acaran animals, the last such fossils recorded on 
Earth. And just 1.2 metres above them, the grey 
siltstone holds trails of scratch marks, thought to 
have been made by animals with exoskeletons, 
walking on jointed legs — the earliest evidence 
of arthropods in Earth’s history. 

No one knows how much time passed in that 
intervening rock — maybe as little as a few cen- 
turies or millennia, says Narbonne. But during 
that short span, the soft-bodied, stationary Edi- 
acaran fauna suddenly disappeared, driven to 
extinction by predators, he suggests. 

Narbonne has closely studied the few fauna 
that survived this transition, and his findings 
suggest that some of them had acquired new, 
more complex types of behaviour. The best clues 
come from traces left by peaceful, wormlike
animals that grazed on the microbial mat. Early
trails from about 555 million years ago meander 
and criss-cross haphazardly, indicating a poorly 
developed nervous system that was unable to 
sense or react to other grazers nearby — let 
alone predators. But at the end of the Ediacaran 
and into the early Cambrian, the trails become 
more sophisticated: creatures carved tighter 
turns and ploughed closely spaced, parallel lines 
through the sediments. In some cases, a curvy 
feeding trail abruptly transitions into a straight 
line, which Narbonne interprets as potential evi-
dence of the grazer evading a predator⁹.

This change in grazing style may have 
contributed to the fragmentation of the micro- 
bial mat, which began early in the Cambrian. 
And the transformation of the sea floor, says 
Narbonne, “may have been the most profound 
change in the history of life on Earth”¹⁰,¹¹. The
mat had previously covered the sea bed like a 
coating of plastic wrap, leaving the underlying 
sediments largely anoxic and off limits to ani- 
mals. Because animals could not burrow deeply 
in the Ediacaran, he says, “the mat meant that 
life was two-dimensional”. When grazing capa- 
bilities improved, animals penetrated the mat 
and made the sediments habitable for the first 
time, which opened up a 3D world. 

Tracks from the early Cambrian show that 
animals started to burrow several centimetres 
into the sediments beneath the mat, which 
provided access to previously untapped nutri-
ents — as well as a refuge from predators. It’s
also possible that animals went in the opposite
direction. Sperling says that the need to avoid 
predators (and pursue prey) may have driven 
animals into the water column above the sea 
bed, where enhanced oxygen levels enabled 
them to expend energy through swimming. 

The emerging evidence about oxygen 
thresholds and ecology could also shed light 
on another major evolutionary question: when 
did animals originate? The first undisputed 
fossils of animals appear only 580 million 
years ago, but genetic evidence indicates that 
basic animal groups originated as far back as 
700 million or 800 million years ago. Accord- 
ing to Lyons, the solution may be that oxygen 
levels rose to perhaps 2% or 3% of modern lev- 
els around 800 million years ago. These con- 
centrations could have sustained small, simple 
animals, just as they do today in the ocean's 
oxygen-poor zones. But animals with large 
bodies could not have evolved until oxygen 
levels climbed higher, in the Ediacaran. 

Understanding how oxygen influenced the 
appearance of complex animals will require 
scientists to tease subtler clues out of the rocks. 
“We've been challenging people working on 
fossils to tie their fossils more closely to our 
oxygen proxies,” says Lyons. It will mean deci-
phering what oxygen levels were in different
ancient environments, and connecting those 
values with the kinds of traits exhibited by the 
animal fossils found in the same locations. 

Late last year, Wood visited Siberia with
that goal in mind. She collected fossils of Clou- 
dina and another skeletonized animal, Suvo- 
rovella, from the waning days of the Ediacaran. 
The sites gave her the chance to gather fossils 
from many different depths in the ancient 
ocean, from the more oxygen-rich surface 
waters to deeper zones. Wood plans to look 
for patterns in where animals grew tougher 
skeletons, whether they were under attack by 
predators and whether any of this had a clear 
link with oxygen levels. “Only then can you 
pick out the story.”


Douglas Fox is a journalist in northern 
California.

Kepler-186f, the first known
Earth-sized exoplanet in
a star’s habitable zone
(artist’s impression).


Astronomers are beginning to glimpse what 
exoplanets orbiting distant suns are actually like. 


BY JEFF HECHT 


The trickle of discoveries has become a torrent. 

Little more than two decades after the first planets were found orbiting 
other stars, improved instruments on the ground and in space have 
sent the count soaring: it is now past 2,000. The finds include ‘hot
Jupiters’, ‘super-Earths’ and other bodies with no counterpart in our
Solar System — and have forced astronomers to radically rethink their 
theories of how planetary systems form and evolve. 

Yet discovery is just the beginning. Astronomers are aggressively 
moving into a crucial phase in exoplanet research: finding out what 
these worlds are like. Most exoplanet-finding techniques reveal very 
little apart from the planet’s mass, size and orbit. But is it rocky like Earth 
or a gas giant like Jupiter? Is it blisteringly hot or in deep-freeze? What is 
its atmosphere made of? And does that atmosphere contain molecules 
such as water, methane and oxygen in odd, unstable proportions that 
might be a signature of life? 

The only reliable tool that astronomers can use to tackle such 
questions is spectroscopy: a technique that analyses the wavelengths 
of light coming directly from a planet's surface, or passing through its 
atmosphere. Each element or molecule produces a characteristic pattern 
of ‘lines’ — spikes of light emission or dips of absorption at known 
wavelengths — so observers can look at a distant object’s spectrum to
read off what substances are present. “Without spectroscopy, you are to
some extent guessing what you see,” says Ian Crossfield, an astronomer 
at the University of Arizona in Tucson. 

But spectroscopy has conventionally required a clear view of the 
object, which is generally not available for exoplanets. Most new worlds 
show up only as an infinitesimal dimming of a star as the otherwise 
invisible planet passes across its face; others are known only from the 
slight wobble of a star being tugged back and forth by the gravity of an 
unseen companion. Astronomers often say that trying to study such an 
object is like staring into a far-off searchlight (the star) and trying to see 
a firefly (the planet) hovering nearby. 

In recent years, however, observers have begun to make headway. Some 
have extracted the spectra of light passing through the atmospheres of 
exoplanets as they cross the face of their parent stars — the equivalent of 
measuring the colour of the firefly’s wings as it flits through the searchlight 
beam. Others have blocked the light of the parent star so that they can see 
exoplanets in distant orbits and record their spectra directly. 

In the past two years, astronomers have begun to record spectra from 
a new generation of custom-built instruments such as the Gemini Planet 
Imager on the 8.1-metre Gemini South telescope at the summit of Cerro
Pachón in Chile. Exoplanet spectroscopy will be a priority for several
spacecraft and ground-based telescopes that are now in development.
And astronomers are waiting eagerly for NASA's James Webb Space Tel- 
escope (JWST), which will bring unprecedented light-gathering power 
and sensitivity to the task when it launches in 2018. 

These are heady times for those hoping to get a deep understanding of 
new-found worlds, says Thayne Currie, an astronomer at Japan’s Subaru 
Telescope on Mauna Kea, Hawaii. “We are on the cusp of a revolution.”


TRANSIT SPECTROSCOPY 

The first exoplanet in orbit around a Sun-like star was discovered in 
1995, when astronomers Michel Mayor and Didier Queloz of the Geneva 
Observatory in Switzerland detected a regular, back-and-forth wobble in 
the movement of star 51 Pegasi. They concluded that it was caused by the 
gravity of a planet at least 150 times the mass of Earth — roughly half the 
mass of Jupiter — orbiting the star every 4 days or so. Other discoveries 
followed as exoplanet fever took hold, and led telescope managers to 
make more observing time available for planet-hunting. 

The list of finds soon sparked an idea for astronomer David 
Charbonneau of the Harvard-Smithsonian Center for Astrophysics in 
Cambridge, Massachusetts. He reasoned that when a planet ‘transits’, or 
passes in front of a star, molecules in its atmosphere will absorb some 
of the starlight, and leave their spectroscopic fingerprints in it. Might it 
be possible to detect those fingerprints? 

To find out, Charbonneau decided to look for sodium. “It’s not 
particularly abundant,” he says, “but sodium has very clear spectro- 
scopic features” — excited molecules of it emit two very strong lines 
of light, which give sodium street lights their familiar yellow-orange 
colour. When the sodium is backlit, the light that floods through it has 
dark bands at the same points of the spectrum, 
and Charbonneau hoped that these would be 
comparatively easy to spot. 

They were: in 2002, Charbonneau and his 
co-workers announced that they had used 
the Hubble Space Telescope to detect a sodium 
signal from a Jupiter-sized exoplanet transiting 
HD 209458, a star about 47 parsecs (150 light 
years) from Earth. It was both the first detec- 
tion and the first spectroscopic measurement 
of an exoplanet atmosphere. Within a few years, 
space-based transit observations were record- 
ing more complete spectra, and detecting gases 
such as carbon monoxide and water vapour. 

Using this technique means looking for very tiny changes in a star’s 
spectrum, says Charbonneau — maybe 1 part in 10,000. Hubble was and 
is observers’ first choice of instrument: it does not have to contend with 
absorption of light by gases in Earth’s atmosphere, so its spectra are very 
clean and easy to interpret. But competition for observing time is intense, 
so astronomers also use ground-based telescopes. 

These do have to deal with atmospheric interference, but can 
overcome it by collecting more light than Hubble can. This allows them 
to detect fainter objects and to separate individual spectral features more 
clearly. That pays off because most exoplanets are in star systems that 
are moving relative to Earth. “So their wavelengths are Doppler-shifted,” 
says Charbonneau, meaning that the radiation coming from them is 
stretched or squeezed by their movement, displacing the spectral lines 
slightly from the corresponding lines in Earth’s atmosphere. Because the 
two sets of spectral lines no longer overlap, observers can know for sure 
how much of the signal comes from the exoplanet. Using this method, 
astronomers have been able to detect gases making up as little as 1 part 
in 100,000 of a planet's atmosphere. 
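
The size of the Doppler displacement can be sketched with a short calculation. The example below uses the non-relativistic approximation, in which a line at rest wavelength λ moves by λ × v / c for a line-of-sight velocity v. The sodium wavelength is an approximate laboratory value; the velocity of 15 kilometres per second and the function name are illustrative assumptions, not figures from the observations described here.

```python
C_KM_S = 299_792.458  # speed of light in kilometres per second


def doppler_shifted_wavelength(rest_nm: float, velocity_km_s: float) -> float:
    """Observed wavelength of a line from a source receding at velocity_km_s.

    Uses the non-relativistic approximation, which is adequate when the
    velocity is a tiny fraction of the speed of light.
    """
    return rest_nm * (1.0 + velocity_km_s / C_KM_S)


rest = 589.0  # approximate wavelength of one of sodium's yellow lines, in nm
velocity = 15.0  # hypothetical line-of-sight velocity, km/s (assumption)
shift = doppler_shifted_wavelength(rest, velocity) - rest
print(f"line displaced by {shift:.4f} nm")
```

A displacement of this size is small, which is one reason the extra light-gathering power of large ground-based telescopes matters: it lets observers resolve the planet's lines separately from those imprinted by Earth's atmosphere.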

An extension of the transit-spectroscopy technique has allowed 
astronomers to measure the light reflected from a planet’s face. They 
do this after the planet moves across the face of its star, when it will 
be on the far side of its orbit, with its daylight side facing Earth (see 
‘Star shades’). Observers will not be able to see it as a separate object 
— but they will know that its spectrum is combined with that of the 
star, says Nicolas Cowan, an astronomer at the McGill Space Institute 
in Montreal, Canada. Shortly afterwards, however, the planet will pass 
behind the star and be eclipsed — at which point, says Cowan, “you 
go from a planet and star to just a star. If you measure the difference in 
flux, you can tell how much light comes from the planet.” The process is 
demanding, he says, but it can measure the infrared spectra of a Jupiter-sized
planet in a close orbit even if it is less than 0.1% as bright as the star. 
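
The subtraction that Cowan describes can be sketched in a few lines. The flux values below are hypothetical, chosen only so that the planet comes out at a little under 0.1% of the star's brightness; the function name is likewise an illustration and not part of any observatory's software.

```python
def planet_to_star_ratio(flux_before_eclipse: float, flux_during_eclipse: float) -> float:
    """Estimate the planet's brightness as a fraction of the star's.

    flux_before_eclipse: combined light of star and planet.
    flux_during_eclipse: light of the star alone, while the planet is hidden.
    """
    if flux_during_eclipse <= 0:
        raise ValueError("star-only flux must be positive")
    planet_flux = flux_before_eclipse - flux_during_eclipse
    return planet_flux / flux_during_eclipse


combined = 1.0008   # hypothetical star-plus-planet flux (arbitrary units)
star_only = 1.0000  # hypothetical flux while the planet is eclipsed
ratio = planet_to_star_ratio(combined, star_only)
print(f"planet is about {ratio:.2%} as bright as the star")
```

In practice the same subtraction would be done wavelength by wavelength to build up a spectrum, and the measurement noise would have to be well below the difference being sought.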

An even more ambitious application of this technique is to follow an 
exoplanet through a complete orbit. By subtracting the star-only spec- 
trum obtained during the planet’s eclipse, observers can get spectra of 
the planet’s atmosphere as its silhouette changes from a thin crescent just 
after transit to a half-moon shape as it swings to the side, then a full-face 
view on the far side. This allows them to produce a comparatively fine- 
grained map of the atmosphere and how it changes over time. Cowan and 
his co-workers first reported using this technique in 2012, with infrared 
data from NASA's Spitzer Space Telescope. They showed that the exoplanet 
HD 189733b was hottest within about 10 degrees of its equator, as 
predicted. Since then, other researchers have used Hubble and Spitzer to 
map exoplanet atmospheres in more detail. And Cowan says that with the 
JWST, “it will be easy to make a 3D map of the atmosphere of a hot Jupiter”. 

Transit spectroscopy does have its limitations. Some exoplanets have 
nearly featureless spectra characteristic of clouds, which consist of drop- 
lets or fine dust particles that do not leave their imprint on the spectrum 
in the same way as isolated molecules. The clouds are a big headache, 
says Charbonneau. “We don’t have any direct measurement of what the 
clouds are made of. We just know they block the light.” They aren't nec- 
essarily made of water vapour. Charbonneau points out that the cloud- 
shrouded super-Earth GJ 1214b, 12 parsecs from Earth, is so hot that 
its clouds could be made of zinc sulfide and 
potassium chloride. On still hotter worlds, the 
clouds could contain droplets of iron or rock. 

Lisa Kaltenegger, director of the Carl Sagan 
Institute at Cornell University in Ithaca, New 
York, points to another limitation of the transit 
method. “When light hits a transiting planet, it 
isn't just absorbed,” she says. “It also gets bent 
in the atmosphere”, making it impossible for 
an observer on Earth to see. This bending, 
known as refraction, increases as the atmos- 
phere becomes thicker. If alien astronomers 
were trying to get a spectroscopic reading of 
Earth, she says, refraction would prevent them 
from probing any deeper than 10 kilometres from the surface. But most 
of Earth’s water is in the lowest 10 kilometres of its atmosphere, she says 
— so by analogy, “water is going to be one of the hardest things to find 
in an Earth-like exoplanet”. 


DIRECT IMAGING 
An alternative approach to finding and studying exoplanets is trying to 
block out the starlight and image them directly, the equivalent of looking 
for the firefly by holding a hand in front of the searchlight. Early efforts 
to do this were futile: even the dimmest parent star is much brighter than 
an exoplanet. The secret of success is to seek brighter fireflies wander- 
ing well away from the searchlight — that is, young planets still glowing 
from the heat of formation, in orbits far from their stars. The first directly 
imaged exoplanets were announced by two groups simultaneously in 
2008. The objects included 3 planets about 60 million years old orbiting 
the star HR 8799 (ref. 7), and a single planet more than 100 million years 
old orbiting Fomalhaut (ref. 8), a bright star some 8 parsecs from Earth. 

To obtain the spectra of such objects, astronomers turned to adap- 
tive optics, a technology that corrects for the twinkling ofa star caused 
by turbulence in Earth’s atmosphere and makes it much easier to spot 
any exoplanets in its vicinity. Also essential are discs inserted into the 
telescope’s optical pathway to block light from the star, and sophisticated 
signal processors to digitally sharpen the images. 

[Figure: ‘Star shades’. If a planet crosses the face of its star, astronomers can obtain its spectrum by subtracting the spectrum of the star alone. Crescent: just before the planet starts to transit the star, its spectrum comes from a thin sliver of its disk. A disc inside the telescope can block enough of the star’s light to allow imaging of planets in large orbits.]

“Direct-imaging spectra are beautiful and tell you a lot about the 
planets and how they formed,” says Bruce Macintosh, an astronomer at 
Stanford University in California and a co-discoverer of the HR 8799 
planets. In 2011, he and his colleagues reported the first detection of 
water vapour on one of those planets using a first-generation direct- 
imaging instrument that could observe only exoplanets with tem- 
peratures higher than 1,000 kelvin. Now, Macintosh is the principal 
investigator for the Gemini Planet Imager, which, along with the simi- 
lar Spectro-Polarimetric High-contrast Exoplanet Research (SPHERE) 
imager at the European Southern Observatory’s Very Large Telescope 
in Chile, is a second-generation instrument built to directly image and 
take spectra of exoplanets down to about 600 kelvin. 

The Gemini instrument launched a multiyear search for Jupiter-like 
planets orbiting hot, young stars in November 2014. Early observations 
of 51 Eridani, a 20-million-year-old star about 30 parsecs away, spotted a 
Jupiter-like world 2.5 times farther from the star than Jupiter is from the 
Sun. The spectrum showed that this exoplanet, dubbed 51 Eridani b, 
has an atmosphere containing more methane — a known component 
of Jupiter’s atmosphere — than any other exoplanet. “The really exciting 
thing with 51 Eridani b and other new exoplanets,” says Currie, “is that 
we see them when their spectra look a little more normal” and Jupiter-like 
than those of planets that are even younger and hotter, where methane is 
strangely absent. That could provide crucial insight into planet formation, 
the current theory of which is based mostly on data from the Solar System. 

SPHERE has embarked on a similar survey, but started later, in Febru- 
ary 2015, and has less to report. Thus far, says team member Anthony 
Boccaletti, an astronomer with the Paris Observatory, the most interesting 
discovery is a group of five gas clumps moving at high velocity away from 
the young star AU Microscopii, which is known to be unusually prone 
to flares and other activity. “We don't really know what they are,” he says. 


STAR SURVEYS 
Exoplanet spectroscopy has come a long way from its early days, when 
practitioners were struggling to extract extremely faint signals from noisy 
environments. The first results were often problematic. Now, Crossfield 
says, “for the most part what we are finding holds up and is repeatable”. 
A coming generation of instruments promises to reveal even more. 
NASA’s Transiting Exoplanet Survey Satellite (TESS), scheduled to 
launch in August next year, will spend two years searching for exo- 
planets transiting more than 200,000 of the brightest stars in the solar 
neighbourhood. Exoplanets will also be targets for the JWST. With its 
6.5-metre telescope and advanced instruments, Webb should see many 
more than the 2.4-metre Hubble. “TESS and Webb will own this space 
in five years,” predicts Macintosh. 


Two other planned — but not yet approved — space missions will use 
exoplanet spectroscopy. NASA’s 2.4-metre Wide Field Infrared Survey 
Telescope, expected to launch in the mid-2020s, would spend most of its 
time on cosmological questions, but is expected to find and study about 
2,600 exoplanets. Currie says that it should be able to image Jupiter-like 
planets orbiting nearby stars, although smaller, colder bodies similar 
to Pluto or the hypothetical ‘Planet X’ speculated to exist at the edge 
of the Solar System — or Earth, for that matter — will remain out of 
reach. “We would need a 10-metre-scale telescope in space to do other 
Earths,” says Macintosh. 


The second mission is ARIEL, the Atmospheric Remote-Sensing 
Infrared Exoplanet Large-survey, one of three candidates for a medium- 
class mission to be launched by the European Space Agency in 2026. 
The 1-metre telescope would be dedicated to transit spectroscopy and 
a survey of exoplanets at temperatures higher than 500 kelvin. 


In about a decade, astronomers hope to see the completion of three 
super-giant telescopes: the 24.5-metre Giant Magellan Telescope at 
the Las Campanas Observatory in Chile, the Thirty Meter Telescope 
planned for Mauna Kea, and the European Extremely Large Telescope 
on Cerro Armazones in Chile. All three will be equipped with adap- 
tive optics systems, and it’s a safe bet that they will be doing exoplanet 
spectroscopy to test models based on the data gleaned up to that point. 

Those measurements could be astronomers’ first realistic chance to 
find life in the wider Universe, says Charbonneau. “I’m so excited.” 


Jeff Hecht is a freelance writer in Auburndale, Massachusetts. 
A shipbreaking yard in Chittagong, Bangladesh. Several workers in the country’s scrapping industry are injured every week. 


Three steps to a green 
shipping industry 


It is time to crack down on the emissions and destructive development caused by vast 
container vessels that pollute the air and seas, write Zheng Wan and colleagues. 


On 26 April 1956, US entrepreneur 
Malcom McLean watched a converted oil tanker leave Port Newark 
in New Jersey carrying 58 of his inventions: 
the modular shipping container. By 2015, 
the largest container ship in the world, with a 
deck the area of 3.5 soccer fields, could carry 
about 20,000 of the units. 

Ever-bigger container ships carry 90% 
of global consumer goods such as clothes 
and food (non-bulk cargo). The seaborne 
container trade has grown from 100 million 
tonnes in 1980 to about 1.6 billion tonnes 
in 2014. Standardized 20-foot (6-metre) 
containers are moved using automated systems 
that connect seaports, airports and train 
stations. Bigger ships carry more containers, 
ideally consuming less oil and releasing fewer 
pollutants for each unit of goods carried. 
Nonetheless, the human and environ- 
mental costs of shipping are vast. Low-grade 
marine fuel oil contains 3,500 times more 
sulfur than road diesel. Large ships pollute 
the air in hub ports, accounting for one-third 
to half of airborne pollutants in Hong Kong, 
for example. Particulates emitted from 
ships cause 60,000 cardiopulmonary and 
lung-cancer deaths each year worldwide. 
Expanding harbours to take vast ships 
destroys coastal ecosystems. And scrapping 
fleets of obsolete smaller ships pollutes 
seas and soils, and damages workers’ health, 
especially in the developing world. 

The industry is at a crossroads. The 
expected profits from larger ships are being 
undermined by excess capacity, slowing 
trade and plunging transport prices. In 
2015, container freight rates for the world’s 
busiest shipping route — between Asia and 
northern Europe — dropped by nearly 60% 
in three weeks. A dozen shipping compa- 
nies went bankrupt, including Denmark's 
Copenship and China’s Nantsing. Even the 
giant container-conveying Danish conglom- 
erate Maersk announced that it would lay 
off 4,000 employees by 2017 and delayed or 
cancelled orders to build mega-ships. 

Companies face a dilemma. If they buck 
the trend of scaling up, they risk being less 
competitive. Yet running mega-ships only 
part full wipes out the benefits of economies 
of scale. Ships use more fuel per container 
when half-loaded than for a full cargo. 

The future is green shipping: efficient 
marine transport with minimal health and 
ecological damage. Cleaner practices — 
especially on ship scrapping, emission con- 
trol and port management — are needed. 
Achieving this will require heroic efforts 
by the industry and its engineers in col- 
laboration with regulators, port authorities 
and communities. Environmental impacts 
should be considered in determining opti- 
mal routes and modes for delivery of goods. 


POLLUTION PROBLEM 

Shipping is the most energy-efficient way to 
move large volumes of cargo. Yet ships emit 
nitrogen oxides (NOx), sulfur oxides (SOx), 
carbon dioxide and particulate matter (PM) 
into the atmosphere. Worldwide, from 
2007 to 2012, shipping accounted for 15% 
of annual NOx emissions from anthropogenic 
sources, 13% of SOx and 3% of CO2. 
In Europe in 2013, ships contributed 18% of 
NOx emissions, 18% of SOx and 11% of particles 
less than 2.5 micrometres in size (PM2.5). 
For road transport, the figures were 33%, 0% 
and 12%, respectively. Aviation, by contrast, 
accounted for only 6%, 1% and 1%, respectively, 
and rail just 1%, 0% and 0%. 

Shipping policies must be applied world- 
wide to be effective. Shipping and aviation 
emissions are not addressed by global cli- 
mate-change agreements, including the deal 
made in Paris last December. The Interna- 
tional Maritime Organization (IMO), which 
regulates international shipping, is engaging 
— slowly. Releases into the oceans of oils, 
noxious liquids, harmful substances, sew- 
age and garbage have been restricted since 
the 1980s by the International Convention 
for the Prevention of Pollution from Ships 
(MARPOL), following a spate of oil-tanker 
accidents. Air-pollution limits for shipping 
were adopted in 1997 but came into force 
only in 2005. 

Energy efficiency is the IMO’s present 
focus. Starting in 2013, its Energy Efficiency 
Design Index and Ship Energy Efficiency 
Management Plan aim to lower CO2 emissions 
from shipping through tighter technical 
requirements on engines and equipment, 
maintenance regimes and voyage plans. No 
absolute emissions-reduction targets were 
set. Unfortunately, long-term expansion 
in global trade and growing ship numbers 
mean that even if these measures are fully 
implemented, total shipping emissions are 
projected to quadruple from 1990 to 2050. 

[Figure: ‘Supersize ships’. Each generation of container ship is getting bigger as economies of scale are expected to bring 
down transportation costs. But the largest rarely carry a full cargo, and pollute hub ports.] 

The IMO has set up four ‘emission-control 
areas’ — the Baltic Sea, the North Sea, the US 
Caribbean and the coastal waters of Canada 
and the United States — where ships are 
required to minimize emissions mainly of 
SOx and NOx. These regions exclude the 
world’s ten largest container ports, such as 
the Chinese ports of Shanghai, Shenzhen and 
Hong Kong and the South Korean port of 
Busan, which are all in Asia (see ‘The dirty 
ten’). We estimate that these ten sites alone 
contribute 20% of port emissions worldwide. 

A few developed countries, including the 
United States, the United Kingdom and Nor- 
way, limit the sulfur content of marine fuel in 
their national waters to within 1,000 parts per 
million (p.p.m.). Most developing countries, 
including India and China, permit dirtier 
fuels with 35,000 p.p.m. of sulfur. The Euro- 
pean Union fuel standard for cars is 10 p.p.m. 

Ship scrapping is heavily polluting. Asbes- 
tos, heavy metals and oils are toxic. Workers 
are exposed to hazardous fumes. The EU 
has laws requiring that ships registered in 
Europe be broken up only in licensed yards 
that meet strict guidelines. But it is easy to 
change a ship’s registration and demolish 
it in a country with a more lax approach to 
labour and environmental protection. 

India, Bangladesh, and Pakistan are 
popular for ship scrapping. In Bangladesh, 
for example, 40,000 mangroves — trees 
that stabilize many tropical coasts and are 
habitats and breeding grounds for many spe- 
cies — were chopped down in 2009 alone 
to accommodate shipbreaking yards. The 
pollution from scrapping there has caused an 
estimated 21 fish and crustacean species to 
become extinct. And reportedly, each week 
one worker dies and seven are injured in the 
scrap yards of Bangladesh. 

Congestion adds to pollution and disrup- 
tion. Large volumes of cargo overwhelm 
ports, surrounding roads and waterways. 
Hasty expansion or construction of berths 
and canals to take more large ships can be 
environmentally disastrous. Where the water 
in existing harbours is too shallow, port 
authorities may reclaim land from the sea or 
build artificial islands in deeper waters. 

Coastal changes destroy ecosystems. 
Over the past three decades, about 75% of 
mangroves have disappeared from Shen- 
zhen, following port expansion and land 
reclamation. Plans for the Porto Sul port in 
Brazil — slated to open in 2019 — identified 
36 potential environmental impacts, includ- 
ing driving away dolphins and whales and 
killing seabed fauna. 

[Figure: ‘The dirty ten’. Emissions-control zones omit the ten largest container ports, which 
contribute an estimated 20% of worldwide port emissions of nitrogen oxides and sulfur oxides.] 

Traditional shipping routes cannot keep 
up. The Panama Canal, which connects the 
Pacific and Atlantic oceans, can currently 
handle vessels carrying only up to about 
5,000 standard containers. A project to 
expand it to accept ships with 13,000 con- 
tainers (the ‘New Panamax’ class) should 
be completed by May. But the largest mega- 
ships, such as Maersk’s E-class and Triple 
E-class (with capacities between 14,000 
and 18,000 containers), will still be unable 
to cross (see ‘Supersize ships’). In the mean- 
time, heavy traffic at Panama, complicated 
navigation and constant maintenance have 
led to a ten-day delay in voyage times. 

To take advantage of the business oppor- 
tunity, construction is scheduled to start this 
year on a 280-kilometre-long canal through 
Nicaragua. This US$50-billion project, 
funded by a billionaire-owned Hong Kong 
company, could destroy almost 400,000 hec- 
tares of tropical forests and wetlands, home 
to threatened and endangered wildlife and 
indigenous communities. 

Public concern about the pollution and 
health impacts of shipping remains muted 
because the industry is a backbone of the 
global economy, and its activities happen 
far from where most people live and often 
beyond the jurisdiction of local regulators. 
We cannot rely only on new ship designs and 
engine innovation to minimize the ecologi- 
cal footprint of shipping: today’s ships might 
be in use for another 20 years or more. Sev- 
eral issues must be addressed together to 
make the industry greener. 


GREEN SHIPPING 

Implementing the following recommen- 
dations could save thousands of lives each 
year, ensure cleaner coastal air and reduce 
ecological damage from shipping. 


Clean up ship scrapping. The IMO adopted 
the Hong Kong International Convention 
for the Safe and Environmentally Sound 
Recycling of Ships in 2009, but only Norway, 
Congo and France have acceded as of February 
2016. The IMO’s priority should be to 
ensure that the principal scrappers — India, 
Bangladesh and Pakistan — adhere to these 
guidelines. The first step is to set up local 
offices in these countries to collect and ana- 
lyse monitoring data independently and to 
propose improvements to local governments. 
International loan or aid programmes to these 
countries, sponsored by the World Bank or 
the Asian Development Bank, for example, 
should demand clean ship-scrapping prac- 
tices as an incentive. To discourage transfer 
of scrapping elsewhere, a watch list of poorly 
performing countries needs to be updated by 
the IMO regularly until an international convention 
enters into force. 


Control emissions. Stricter IMO emissions 
regulations are needed, including a cleaner 
worldwide standard for sulfur released by 
combustion of marine fuel. A 97% cut in 
SOx can be achieved by reducing the sulfur 
content of fuel oil from 35,000 p.p.m. to 1,000 p.p.m. 
Today’s low oil price provides a great 
opportunity for this transition to happen. 
The current cost of 1,000-p.p.m.-grade fuel 
oil (around US$300 per tonne in Singapore, 
for example) is less than half of that of the 
cheapest dirty fuel four years ago. 
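
The 97% figure follows directly from the two sulfur limits, as the short calculation below shows. It rests on one simplifying assumption that is ours and not stated in the text: that sulfur-oxide emissions scale in direct proportion to the sulfur content of the fuel burned. The function name is illustrative.

```python
def sulfur_cut_fraction(old_ppm: int, new_ppm: int) -> float:
    """Fractional reduction in fuel sulfur content.

    Assumes sulfur-oxide emissions are proportional to fuel sulfur content,
    so the same fraction applies to emissions.
    """
    return (old_ppm - new_ppm) / old_ppm


cut = sulfur_cut_fraction(35_000, 1_000)
print(f"{cut:.1%}")  # 97.1%
```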

Marine fuel is a sideline for oil refiner- 
ies — only 2-4% of the total fuel market. 
Stricter emissions standards will stimu- 
late demand for high-quality fuel. Incen- 
tive programmes (tax rebate and subsidies 
for producers) will be needed to ensure a 
reasonable profit margin to recover the 
initial high investment in developing coun- 
tries, where there is little current capacity. 
Government interventions will be needed 
in countries with state-run oil companies, 
such as in China and India. 

An alternative is to install scrubbers for 
exhaust-gas cleaning on ships. Scrubber units 
blend the exhaust gas with water or caustic 
soda to remove up to 99% of SOx and 98% 
of particulate matter from high-sulfur fuel. 
At the moment, scrubbers are expensive, 
costing $2 million for one ship. But China, 
for instance, could equip its entire container 
fleet in one year by funding a 50% subsidy 
for scrubbers. The total cost? Just 0.5% of the 
$150 billion per year it has spent since 2013 
to fight pollution. Shipping companies could 
recoup the other 50% in one year from fuel 
savings. With a stricter emissions standard, 
the demand for scrubbers would go up, and 
the costs down, as production scales. 
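
The subsidy arithmetic can be checked in a few lines. The budget, unit cost and subsidy rate are the figures quoted above; the number of ships is not given in the text, so the fleet size printed here is a back-calculation from those figures and should be read as an implied value, not a reported one. Variable names are illustrative.

```python
ANNUAL_POLLUTION_BUDGET_USD = 150_000_000_000  # $150 billion per year (from the text)
SCRUBBER_COST_USD = 2_000_000                  # $2 million for one ship (from the text)

# 0.5% of the annual budget, computed with integers to avoid rounding error.
subsidy_pot_usd = ANNUAL_POLLUTION_BUDGET_USD * 5 // 1000

# A 50% subsidy covers half the cost of each scrubber.
subsidy_per_ship_usd = SCRUBBER_COST_USD // 2

# Implied number of ships that the pot could equip (back-calculated assumption).
implied_fleet_size = subsidy_pot_usd // subsidy_per_ship_usd

print(f"subsidy pot: ${subsidy_pot_usd:,}")
print(f"implied fleet size: {implied_fleet_size} ships")
```

On these numbers the subsidy would total $750 million, enough to cover half the cost of scrubbers for about 750 ships.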


Improve port management. Port authori- 
ties should review the environmental impact 
of their previous construction and disclose 
information on their future development 
plans to demonstrate responsible management 
of public assets. They should coordinate 
with transport-planning bureaus to 
determine the most economical and environmentally 
friendly strategy for dispatching goods, the 
optimal capacities of their terminals and how 
to help ships to load and unload quickly. 
Making port-business statistics and the 
results of environmental-impact studies 
accessible will allow the research commu- 
nity to be involved in the decision-making 
process. Environmental non-governmental 
organizations should campaign to increase 
public awareness of port development. 

After decades of loose oversight, it is time 
for shipping to get a whole lot greener. 


Zheng Wan is associate professor, Mo Zhu 
is assistant professor and Shun Chen 
is associate professor at the College of 
Transport and Communications, Shanghai 
Maritime University, Shanghai, China. 
Daniel Sperling is distinguished professor 
of civil engineering and environmental 
science and policy, and founding director of 
the Institute of Transportation Studies, at the 
University of California, Davis, USA. 
e-mail: mrwan@ucdavis.edu 
Radiant realms 


Philip Ball enjoys two explorations of light, spanning 
wonders from Newton’s spectrum to the aurora borealis. 


The northern lights over Norway — produced by the solar wind and Earth’s magnetic field. 

Last year’s International Year of Light 
is now history, but light remains high 
on the scientific agenda. From photosynthesis 
to quantum optics and light-emitting 
organic compounds, there is no getting 
away from it, even if you wanted to. Bruce 
Watson's Light and Melanie Windridge’s 
Aurora — one encyclopaedic, the other more 
focused — are as eloquent about the aesthetic 
pleasures of light as they are about the 
scientific issues that it presents. The temptation 
to call them lucid and illuminating only 
highlights (there I go again) the linguistic 
currency of this fundamental phenomenon. 
In Light, Watson offers a wide-ranging tour 
of the cultural, literary and scientific response 
to his subject. Our associations with light 
seem almost unrelentingly favourable. Light 
is divine, dark demonic. “God Appears & God 
is Light/To those poor Souls who dwell in 
Night”, wrote William Blake in 1803, in ‘Auguries 
of Innocence’. What could seem more 
natural than this dualism? We have sought 
for centuries to understand and use light, and 
never stopped worshipping it. Watson seems 
to acknowledge that by bookending his study 
with contemporary sunrise ceremonies at the 
prehistoric stone monuments of Stonehenge 
in England and Newgrange in Ireland. 

The stops along the way will surprise no 
one who has a passing acquaintance with the 
traditions and science of light. The dazzling 
list includes creation myths, Aristotle’s 
meditations on vision and ether, the light 
metaphysics of medieval Neoplatonism that 
identified God with radiance and informed 
Gothic architecture, the tonal contrast of 
light and dark (chiaroscuro) developed 
by Leonardo da Vinci, the work of early- 
modern scientists from Johannes Kepler to 
Isaac Newton, polymath Thomas Young's 
wave theory of 1800, painter J. M. W. Turn- 
er’s last words (“The Sun is God”), Thomas 
Edison’s light bulb, Albert Einstein’s quanta 
and the laser. Even when covering familiar 
ground, Watson's touch is lyrical and deft. 

Romantic poet John Keats famously 
declared in his 1820 epic poem Lamia that 
the progression from uncomprehending 
wonder to scientific understanding had 
unwoven the rainbow. He underestimated 
light’s enduring appeal. Light might now 
be just a little slice of a continuous electro- 
magnetic spectrum; trapped in optical fibres 
and sent hither and thither at our bidding; 
chopped into quanta and parted into electri- 
cal and magnetic oscillations. But it has not 
lost its primeval power to enchant. Modern 
artists make astonishing light sculptures, 
from James Turrell’s subliminal illusions to 
Carlos Cruz-Diez’s disorienting chromati- 
cally saturated light-rooms. Coloured light is 
still used to summon contemplation, as seen 
in the stained glass of Henri Matisse’s mid- 
twentieth-century Chapel of the Rosary in 
Vence, France. And the crowds that Watson 
joined at Neolithic monuments are not hippie 
throwbacks. They are simply people eager to 
experience the awe of a primal event — the 
appearance of the Sun’s first rays. 

In any case, it is foolhardy to assert that light 
is scientifically understood. It is more like a 
beam guiding us to deeper mysteries. New- 
ton’s experiment with prisms in a “darkend 
chamber” to reveal the spectrum of sunlight is 
often presented as a turning point, celebrated 
by Alexander Pope in his ‘Epitaph on Sir Isaac 
Newton’: “Nature and Nature's Laws lay hid in 
Night/God said, ‘Let Newton be!’ and all was 
light.” But physicists including Young, James 
Clerk Maxwell and Einstein all radically 
altered Newton's picture — a picture that is 
still shifting, with the use of quantum optics to 
explore the fundamentals of quantum theory 
and reveal the strange connections that may 
persist between photons. 




Light: A Radiant History from Creation to the Quantum Age 
BRUCE WATSON 
Bloomsbury: 2016. 

Aurora: In Search of the Northern Lights 
MELANIE WINDRIDGE 
William Collins: 2016. 


Watson's book is an eye-catching display, 
reflecting and refracting like a gemstone. One 
moment we are among late-nineteenth-cen- 
tury Parisians watching the cinematograph of 
the Lumière brothers (was there ever a better 
example of nominative determinism?). The 
next we are examining Rembrandt's 1642 
painting The Night Watch — which depicts 
a militia emerging into the light — or com- 
miserating with modern astronomers about 
light pollution. This is all tremendous fun, 
but never quite focuses into a coherent image. 
Perhaps light has too many facets to offer uni- 
fying themes. However, like other mythologi- 
cally charged substances (such as water and 
gold), it does not easily shed its associations 
under scientific scrutiny. That may be why 
transformation optics has become a ‘tech- 
nology of invisibility’ and quantum optics 
claims to promise teleportation. The electro- 
magnetic ether was once considered a possi- 
ble bridge between the physical and spiritual 
worlds; in a way, light still holds that allure. 

Nowhere is that more apparent than in 
the phenomenon of the aurora borealis, or 
northern lights, produced when subatomic 
particles in the solar wind collide with mol- 
ecules in the atmosphere at Earth’s magnetic 
pole, stimulating emissions of light. Plasma 
physicist Melanie Windridge embraces the 
aurora’s magical quality even as she attempts 
to unravel the science. She has camped at temperatures 
approaching −40 °C in Svalbard, 
Norway's Arctic archipelago, to experience 
the visceral thrill of seeing these shimmering 
curtains. It is clear in this captivating book 
that her technical understanding has not 
dimmed her delight. 

Sunlight streaming through windows at New York City’s Grand Central Terminus. 

Windridge’s account explains what we do 
and do not yet know about the auroras, and 
how this understanding has evolved. The 
journey takes in the restless, violent activity 
of the Sun’s outer layers, the physics of interac- 
tion between the plasma of the solar wind and 
Earth’s magnetic field, the atomic physics that 
generates the various colours of the aurora 
(oxygen produces the predominant green 
and red, nitrogen the rarer blue and purple) 
and the potential of auroral monitoring (for 
instance, of changes in shape) to provide 
advance warning of disruption to telecommu- 
nications caused by stormy “space weather”. 

What emerges is the tremendous difficulty 
of formulating intuitive and predictive mod- 
els of a turbulent and chaotic process. The 
auroral morphologies are described even now 
in qualitative terms — arcs, veils, rays and so 
forth — and a full theory is still lacking. Windridge 
makes a persuasive case that the quest 
to find one is both pragmatic and inspired by 
the visual majesty of nature’s light. 


Philip Ball is a writer based in London. His 
latest book is Invisible. 
e-mail: p.ball@btinternet.com 


Broken bridges and highways from hell 


Kyle Shelton applauds a study that probes the parlous state of US infrastructure. 


In The Road Taken, engineer and historian 
Henry Petroski surveys the state of US 
bridges, roads and tunnels — the legacy 
of two centuries of technological development 
— and finds them crumbling. Echoing 
Robert Frost’s poem ‘The Road Not Taken’, 
Petroski reflects on both physical highways 
and the choices that contributed to their 
current state. As Frost wrote, “Two roads 
diverged in a wood, and I—/I took the one 
less traveled by,/And that has made all the 
difference.” Petroski compellingly shows that 
only by closely investigating the figurative 
roads down which the country has travelled, 
and the decisions made at their forks, can the 
past be used to shape the future. 

Some of those paths, he reveals, have been 
revolutionary — from the advent of asphalt 
for road construction in 1870 to the quest 
for autonomous cars. Some have thrown up 
obstacles that affect future choices: expen- 
sive projects, huge maintenance demands, 
short-sighted political 
decisions and lacklus- 
tre design. Petroski 
shows how the choices 
of policymakers and 
engineers contributed 
to soaring successes such as the Golden Gate 
Bridge in San Francisco, California, and 
tragic failures including the 2007 collapse of 
the I-35W bridge in Minneapolis, Minnesota, 
which killed 13 and injured 145. 

A typical history of US transport infra- 
structure would begin with the travel revo- 
lution of the early nineteenth century. This 
spawned the steamship, New York’s Erie 
Canal and the first federally funded highway, 
the National Road. Then followed railways, 
bicycles and the early-twentieth-century 
boom in automobiles and road construction. 

Petroski touches on much of this, as well as 
infrastructure below ground. In an opening 
vignette about his Brooklyn childhood, he 
highlights the contradictory nature of all that 
cabling and piping — simultaneously invis- 
ible and utterly necessary, the ever-present 
context of our daily lives, carrying electricity, 
waste, water and gas to and from our homes. 
But his focus is the structures and elements 
of the ever-evolving road. 

He delves into the progression of road 
surfaces from carefully stacked stone to tar- 
mac, concrete and asphalt, in the unending 
quest to make smooth, durable surfaces. 
He shares the history of modal conflict — 
between horse-powered vehicles, pedestri- 
ans and automobiles — by looking at how 
safety features and traffic signals developed. 
Finally, he discusses materials failure not only 
through how steel and concrete deteriorate, 
but by connecting that decay to financial and 
political choices that undervalued risk. There 
are more than 6 million kilometres of public 
roads in the United States, and its citizens 
travel nearly 5 trillion kilometres a year on 
them. But as the American Society of Civil 
Engineers reported in its 2013 ‘report card’ 
on US infrastructure, 66,749 US bridges (1 
in 9) are structurally deficient, and 32% of 
US roads are in “poor or mediocre condition”. 

Petroski shows how past practices ripple 
forwards. Details paint a rich, sometimes 
pungent, history: for example, woodblock 
roads (first used widely in the mid-nine- 
teenth century) fell out of favour because they 
absorbed horse urine. On the country’s noto- 
rious scourge of potholes, Petroski describes 
how the combined forces of heavy traffic, 
water and freezing temperatures conspire 
to create spreading cracks and distortions in 
asphalt. He concludes by casting forwards to 
future roads. Here, he opines, self-driving cars 
will create safer roads by calculating hazards 
faster than human drivers can. Asphalt could 
be revolutionized by mixing in fibres of steel 
wool; when these are heated by induction 
coils on a special vehicle, the asphalt will melt 
and reform, becoming self-mending. 

Petroski’s goal is to ask how, given the 
importance of the car to the US economy 
and mobility, federal and state governments 
have allowed the coun- 
try’s infrastructure to 
reach crisis point. But 
he goes beyond 
hand-wringing. With an 
engineer’s technical 
knowledge and a his- 
torian’s eye, he offers 
a nuanced argument 
about the political, 

The heavily used I-35W bridge in Minneapolis, Minnesota, collapsed in 2007. 


Petroski urges political engagement. He cites 
the controversial rebuilding of the eastern 
span of the Bay Bridge in Oakland, California, 
which opened in 2013. This cost a whopping 
US$6.4 billion, took 12 years to build, was 
beset by construction problems and was 
arguably over-designed. He also discusses the 
disputes over the Department of Transporta- 
tion’s Highway Trust Fund, intended as the 
primary support for federal investment in US 
road systems, but due to run out of money this 
June. Politicians, he shows, too often choose 
expedience over sound solutions. 

Despite that focus, Petroski does leave 
out many of today’s crucial political debates. 
There is almost no discussion of the local 
decisions that shape US roads, such as where 
or how state and city officials decide to apply 
federal and state funds, or how munici- 
palities shore up their road funds through 
different financing mechanisms. Nor does 
Petroski delve into the vexed issue of the 
displacements caused by interstate con- 
struction since the 1950s, which at its peak 
forced between 60,000 and 100,000 people to 
leave their homes every year. Finally, despite 
framing the book as a look at choices made 
and not made, he does not comment on the 
possibility that roads and their continued 
construction are creating further problems. 
As politicians and citizens consider how 
to fix existing highways, byways and bridges, 
they must also consider whether remaining 
so tethered to the system is a necessity. Might 
it not be time to ask how major investments 
in mass transit could complement, if not start 
to replace, the US reliance on cars? Reflec- 
tive decision-making is, after all, at the core of 
what this informative book recommends. 


Kyle Shelton is programme manager and 
a fellow at the Kinder Institute for Urban 
Research at Rice University in Houston, Texas. 
e-mail: kyle.k.shelton@rice.edu 


Surfaces, the exhibition reviewed in 
‘Medical modernist’ (Nature 530, 30–31; 
2016), was originally a joint production 
of the Museum of Concrete Art and the 
German Museum of Medical History, both 
in Ingolstadt. 




Correspondence 


Technology alone 
won’t save climate 


A dragon was buried at the 
Paris climate meeting (COP21): 
‘climate sceptics’ disappeared. 
Now we face a second, equally 
formidable dragon: unreasonable 
optimism about ‘new energy 
technologies’. This optimism 
supports economic-growth 
models driven by innovation, 
but depends on an unimaginable 
scale and rate of deployment. 

Defeating the second dragon 
requires that we reconsider our 
habits of energy usage. Thirty 
years of engine-efficiency 
gains have been eclipsed by our 
preferences for ever-larger cars 
that are often 20 times heavier 
than the passengers — but these 
are habits, not needs. 

We could continue to live 
well in rich economies with, say, 
one-quarter of the energy. For 
instance, we could run the boiler 
for one-quarter of the time and 
quarter our movement of mass 
— the total of all vehicles, freight 
and people, measured in tonne- 
kilometres. We could also make 
buildings and goods with half the 
material (without risking safety) 
and keep them for twice as long. 

‘Success’ today is largely 
associated with derivative 
measures of increasing gross 
domestic product, profitability, 
speed or salary. Yet our value 
systems are based on integral 
measures of quality and stock: 
reputation, heritage, journeys 
and relationships. We need to 
expand the dialogue of climate 
mitigation to reflect these values. 
Challenging our habits of energy 
use should be the first priority of 
climate policy. 
Julian Allwood University of 
Cambridge, UK. 
jma42@cam.ac.uk 


Formalize recycling 
of electronic waste 


India urgently needs a formal 
recycling policy for its mountain 
of electronic waste. Boosted by 
illegally imported discards from 
the West, this waste is expected 
to reach a total of around 
30 million tonnes by 2020. 
Western electronic waste comes 
largely from countries’ weak 
legislation on its handling and 
management (G. Agoramoorthy 
and C. Chakraborty Nature 485, 
309; 2012). Although people 
in India informally recycle an 
estimated 95% of electronic 
waste for profit, the practice 
could soon be overwhelmed. 
India’s government proposed 
draft regulations for this waste 
in June 2015, to be formalized 
after a public consultation. These 
are already proving effective, 
but there is still a pressing need 
for national policy to alleviate 
damage to the environment. This 
would create employment and 
commercial opportunities, address 
health and safety concerns, and 
forge a path towards sustainability. 
Devika Kannan, Kannan 
Govindan University of Southern 
Denmark, Odense, Denmark. 
Madan Shankar Anna 
University, Chennai, India. 
kgov@iti.sdu.dk 


International accord 
on open data 


The accord Open Data in a Big 
Data World has been produced 
by representative bodies of global 
science collaborating as Science 
International (see go.nature.com/ 
tpq3tu). It sets out the principles 
for maximizing benefit from the 
digital data revolution in shaping 
the future conduct of science. 

Openness is the bedrock 
for benefit. Whole science 
systems, not merely the habits of 
researchers, need to adapt. It will 
be necessary for public funders 
of research to fund open-data 
management, for publishers 
to ensure that open data are 
deposited concurrently with the 
publication of derived scientific 
claims, for disciplinary societies 
to debate how their disciplines 
should adapt, and for universities 
to create incentives and support 
for open-data processes. 

The accord recognizes 
potential pathologies: that the 
data deluge could overwhelm 
the open scrutiny of scientific 
claims, and that a countervailing 
trend towards privatization of 
knowledge could be at odds with 
the ethos of scientific inquiry 
and our need to use ideas freely. 
It is crucial that standards of 
reproducibility are re-established 
for a data-rich age, and that the 
global scientific community 
commits to “intelligently open” 
science (see go.nature.com/ 
dvgdfo). Digital technologies also 
provide a route to open science 
and open knowledge, where all 
sectors of society are involved in 
the co-design and co-production 
of actionable knowledge. 
Geoffrey Boulton ICSU CODATA; 
and University of Edinburgh, UK. 
g.boulton@ed.ac.uk 


Control wildlife 
pathogens too 


Policies to control diseases caused 
by invasive alien species should 
be extended to cover endangered 
wild species, ecosystems and 
their services — not just humans, 
livestock and cultivated plants. 

Of the 100 invasive alien 
species listed by the International 
Union for Conservation of Nature 
as the ‘world’s worst’, one-quarter 
have environmental impacts that 
are linked to diseases in wildlife 
(M. J. Hatcher et al. Front. Ecol. 
Environ. 10, 186–194; 2012). 
Identifying and managing this 
threat calls for coordinated 
interdisciplinary expertise. 

Priorities are to collect baseline 
information on the distribution 
and population dynamics of 
pathogens, hosts and vectors; to 
determine the relative importance 
of invasion pathways; and to 
develop methods for predicting 
host shifts, pathogen-host 
dynamics and the evolution 
of alien pathogens (see also 
go.nature.com/ux4wpp). 

This integrated strategy is 
geared towards the goals set by 
the Convention on Biological 
Diversity for managing invasives. 
Helen Roy* NERC Centre 
for Ecology and Hydrology, 
Wallingford, UK. 
hele@ceh.ac.uk 
*On behalf of 4 correspondents (see 
go.nature.com/upyjwi for full list). 


Weapons plutonium 
riskier above ground 


Cameron Tracy and colleagues 
argue that the US Department of 
Energy (DOE) should conduct 
a new safety assessment of its 
nuclear Waste Isolation Pilot 
Plant (WIPP) before loading it 
with 34 tonnes of plutonium from 
dismantled nuclear weapons 
(Nature 529, 149–151; 2016). We 
contend that the long-term risk 
of disposal in WIPP should be 
balanced against the benefits. 

The reanalysis that the 
authors propose would take 
several years. During this time, 
Congress could well abandon the 
disposal programme. Leaving 
the plutonium above ground 
indefinitely would pose a much 
greater environmental threat than 
disposing of it in WIPP. 

Disposal in WIPP would also 
offer a cheaper, simpler and more 
secure alternative (see go.nature. 
com/h81nb5) to the unaffordable 
plan to convert the plutonium to 
mixed oxide with uranium and 
burn it in nuclear-power plants. 

There is a way to keep the 
disposal programme moving. The 
DOE proposes to put 6 tonnes of 
excess weapons-usable plutonium 
in WIPP from its Savannah River 
site. This will not markedly affect 
the WIPP inventory. Packaging 
that plutonium for disposal will 
take about 6 years (see go.nature. 
com/qhl8uf). Meanwhile, the 
DOE could redo the WIPP 
safety analysis and evaluate 
other disposal options for the 
34 tonnes of plutonium (see, for 
example, go.nature.com/2cik4o), 
which will remain in bunkers at 
Savannah River and in Texas until 
a solution is found. 

Edwin Lyman Union of Concerned 
Scientists, Washington DC, USA. 
Frank von Hippel Princeton 
University, New Jersey, USA. 
fvhippel@princeton.edu 
OBITUARY 


Marvin L. Minsky 


(1927–2016) 


A founding father of artificial intelligence. 


Marvin Lee Minsky had no patience 
with those who doubted that computers 
could be intelligent at a 
human level or beyond. In the late 1950s, 
building on the work of Alan Turing, along 
with computer scientists John McCarthy, 
Herbert Simon and Allen Newell, Minsky 
started the work that led everyone to think 
of this group as the founders of the field of 
artificial intelligence (AI). Were it not for 
their determined advocacy, AI might have 
foundered. 

Minsky, who died on 24 January, was born 
in New York City in 1927. After serving in 
the US Navy in the Second World War, he 
earned a degree in mathematics in 1950 
from Harvard University in Cambridge, 
Massachusetts, where he impressed the 
mathematician Andrew Gleason by prov- 
ing fixed-point theorems in topology. Dur- 
ing his doctorate on learning machines, at 
Princeton University in New Jersey, he built 
one out of vacuum tubes and motors. 

When Minsky finished his PhD, the 
eminent mathematicians John von 
Neumann, Norbert Wiener and Claude 
Shannon all recommended him for appoint- 
ment as a junior fellow at Harvard. During 
this time, he became curious about how the 
brain works, but was frustrated by the limi- 
tations of conventional microscopy, which 
could not provide clear images of thick, 
light-scattering neural tissue. This led to his 
invention of the confocal scanning micro- 
scope, which uses lenses to focus light on 
successively small volumes. 

In the late 1950s, with McCarthy, he 
founded a group that became the Artifi- 
cial Intelligence Laboratory at the Massa- 
chusetts Institute of Technology (MIT) in 
Cambridge. In 1961, Minsky published his 
famous paper ‘Steps toward artificial intelligence’ 
(M. Minsky Proc. IRE 49, 8–30; 1961), 
a call to arms for a generation of researchers. 
Scientists flocked to Minsky’s laboratory to 
take on the challenge of understanding intel- 
ligence and endowing computers with it. 
They benefited from Minsky’s wisdom and 
enjoyed his insights, lightning-fast analyses 
and impish repartee. 

His students felt part of a scientific revolu- 
tion. They helped Minsky to develop high- 
level theories about how programs could 
recognize structures made of toy blocks, 
answer questions about stories written for 
children, learn something definite from indi- 
vidual examples and exhibit common sense. 


His laboratory was an egalitarian utopia. 
He didn’t notice looks, gender, age or sta- 
tus. He cared only about ideas and ability. 
Minsky and his wife, Gloria, often welcomed 
students into their home, where several 
pianos stood as a reminder that Minsky was 
a musical prodigy, able to improvise fugues. 

Minsky’s attention span was short. When- 
ever I explained an idea to him, he would leap 
ahead of me, having worked the whole thing 
out after a few sentences. Once, I suggested 
that if we ever developed really intelligent 
machines, we should do a lot of simulation 
before we let them loose in our world to be 
sure they weren't dangerous. “And we're the 
simulation?” he said, guessing my punchline. 
“It isn’t going very well, is it?” 

His laboratory built pioneering robots 
as well as revolutionary programs. Minsky 
invented a robot arm with 14 degrees of 
freedom. He argued that space exploration 
and nuclear-material processing would be 
simpler with manipulators driven locally by 
computers or remotely by human operators. 
He foresaw that microsurgery would be done 
by surgeons using telepresence systems. 

In the late 1960s, Minsky and MIT math- 
ematician Seymour Papert worked on the 
mathematics of perceptrons — simple neu- 
ral networks — showing what they could 
and could not do, which raised the sophis- 
tication of research on neurally inspired 
mechanisms to a higher level. Minsky and 
Papert collaborated into the 1970s and 
early 1980s, developing theories of intel- 
ligence and radical approaches to early 
education that centred on teaching children 
to program using the Logo language. 

In the mid 1970s, Minsky introduced 
‘frames’, a way of describing entities and 
situations using a template-like representa- 
tion. A frame describing a birthday party, 
for example, would have a slot for the person 
celebrated, that person's age and a list of the 
gifts presented, along with slots for time and 
place inherited from a ‘celebration’ frame. He 
also developed the idea of knowledge lines 
(K-lines) to address questions about how 
information is represented, stored, retrieved 
and used in the brain. He argued that K-lines 
help us to solve problems by putting us back 
into mental states that resemble those we were 
in when we previously thought about similar 
problems. 
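
The frame idea described above maps naturally onto a small data structure. What follows is only a minimal sketch, not Minsky's own formulation: the class names `CelebrationFrame` and `BirthdayPartyFrame`, the helper `unfilled_slots` and the sample values are all hypothetical, chosen to mirror the birthday-party example, with the time and place slots inherited from the more general frame.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CelebrationFrame:
    """Generic frame: slots shared by every kind of celebration (hypothetical)."""
    time: Optional[str] = None
    place: Optional[str] = None


@dataclass
class BirthdayPartyFrame(CelebrationFrame):
    """Specialised frame: inherits time and place, adds its own slots."""
    person: Optional[str] = None
    age: Optional[int] = None
    gifts: List[str] = field(default_factory=list)

    def unfilled_slots(self) -> List[str]:
        """Return the names of slots that have not been filled in yet."""
        return [name for name, value in vars(self).items()
                if value is None or value == []]


# Fill some slots and leave others open, as a template would allow.
party = BirthdayPartyFrame(time="Saturday 3 pm", person="Ada", age=7)
party.gifts.append("kite")
```

Here the only slot left open is `place`, which a program could try to fill from context or from a default supplied by the parent frame.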

In 1985, he brought these and many other 
ideas together in a book, The Society of Mind 
(Simon & Schuster). He wrote that intelli- 
gence emerges from the cooperative behav- 
iour of multiple agents, none of which is 
intelligent. Then, in 2006, Minsky published 
The Emotion Machine (Simon & Schuster), a 
book about intelligence, creativity, emotion, 
consciousness and common sense. Multi- 
plicity is a dominant theme. He noted, for 
example, that concepts such as intelligence 
are ‘suitcase words’, into which one can 
stuff multiple meanings. He wrote that our 
resourceful intelligence arises from multiple 
ways of thinking on multiple levels, and from 
multiple ways of representing knowledge. 

In recent years, Minsky found it ironic that 
the doubters of the possibility of AI have been 
replaced by worriers about its consequences. 
He didn't see a technical advance that would 
justify the change in attitude, attributing 
recent successes in AI to faster computers. He 
thought that not much real progress had been 
made in the field for several decades, but he 
had, nevertheless, no doubt that our species’ 
greatest legacy will be the intelligent comput- 
ers that we create. 

Minsky’s talks, papers and books are like 
diamond mines. The riches will take decades 
to cut and polish, inspiring researchers for 
decades to come. 


Patrick Henry Winston is professor of 
artificial intelligence at the Massachusetts 
Institute of Technology, Cambridge, 
Massachusetts, USA. He was a graduate 
student of Marvin Minsky’s in the 1960s, and 
thereafter an admiring friend and colleague. 
e-mail: phw@mit.edu 
QUANTUM PHYSICS 


Photons paired with phonons 


The force exerted by light on an object has been used to pair photons with quantum units of mechanical vibration. This paves 
the way for mechanical oscillators to act as interfaces between photons and other quantum systems. SEE LETTER P.313 


MILES BLENCOWE 


Quantum entanglement is a bizarre state 
in which it is meaningless to describe 
the properties of individual objects in a 
collection; only the properties of the collection 
as a whole may be described. On page 313 of 
this issue, Riedinger et al.¹ report the quantum 
pairing of light and vibrations of microscopic 
mechanical oscillators comprising more than 
10” atoms — large for a quantum object. This 
is a big step towards the goal of using light 
to achieve the quantum entanglement of the 
vibrational motion of two widely separated 
mechanical oscillators, aiding the development 
of quantum-information processing systems 
that have practical applications. 

Riedinger and colleagues exploit the fact 
that light shining on an object exerts a force. 
If the object is a freely suspended wire that is 
clamped at both ends, such as a silicon nanobeam, 
then an incident light pulse will set it 
vibrating, like tapping a bell with a hammer. 
The silicon nanobeam used in the authors’ 
experiments was about 15 micrometres long, 
500 nm wide and 250 nm thick, and was engi- 
neered so that a fraction of the incident light 
from a near-infrared laser source could be 
trapped in a segment of the nanobeam. 

This segment functions like an optical cavity 
(a system used to trap light at certain frequen- 
cies known as resonances), in which the cavity 
length is comparable to the wavelengths of both 
the light and of a particular vibrational mode 
of the nanobeam. The force exerted by the light 
is considerably enhanced by co-locating the 
optical cavity and the vibrating regions, rather 
than letting the light ‘tap’ the nanobeam from 
the outside. The mechanical-vibration mode 
is driven from within by the trapped light, 
and manifests as a ‘breathing’ mode: rapidly 
alternating expansions and contractions of the 
beam's width at about 5.3 gigahertz. 

Figure 1 | A photon–phonon interface. a, Riedinger et al.¹ shone pairs of light pulses on a microscopic 
mechanical oscillator (a silicon nanobeam) that also incorporated an optical cavity — a section with 
different-sized holes that traps standing waves of light at a resonant frequency. The first light pulses in 
each pair were blue-detuned (their energy is slightly higher than the resonance frequency of the optical 
cavity) and could induce the formation of a single phonon (a quantum unit of vibration) in the oscillator 
and produce a scattered photon at the resonance frequency. b, The second pulses were red-detuned 
(with slightly lower energy than the resonance frequency) and could absorb the single phonon from 
the nanobeam, again generating a scattered photon at the resonance frequency. By measuring the joint 
probabilities of scattered photons produced from the two types of light pulse, the authors established that 
the number of photons in the cavity correlates with the number of phonons. 

At the very low light intensities used in 
Riedinger and co-workers’ experiments, a 
quantum description of the system is necessary 
in which the light consists of photons 
and the vibrational energy of the nanobeam 
comes only in discrete lumps called phonons. 
Describing the action of the light force 
at the quantum level, an incident photon can 
emit and hence create a phonon only if the 
energy lost by the resulting scattered photon 
corresponds to the resonance frequency of the 
cavity (Fig. 1a). Similarly, an incident photon 
can absorb and hence annihilate a single phonon 
only if the energy of the resulting scattered 
photon corresponds to the resonance frequency 
of the cavity (Fig. 1b). Higher-energy 
incident photons that can emit phonons are 
said to be blue-detuned with respect to the 
cavity’s resonance frequency, whereas lower- 
energy incident photons that can absorb 
phonons are red-detuned. 
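
The two detuning conditions can be summarized as energy conservation. The symbols here are assumptions introduced for illustration and do not appear in the text: \omega_{\mathrm{in}} is the frequency of the incident photon, \omega_{c} the cavity resonance frequency and \Omega_{m} the breathing-mode frequency (about 2\pi \times 5.3 GHz). In both cases the scattered photon leaves at \omega_{c}.

```latex
\begin{align}
\text{blue-detuned (phonon created):}\quad
  \hbar\omega_{\mathrm{in}} &= \hbar\omega_{c} + \hbar\Omega_{m}, \\
\text{red-detuned (phonon absorbed):}\quad
  \hbar\omega_{\mathrm{in}} &= \hbar\omega_{c} - \hbar\Omega_{m}.
\end{align}
```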

Riedinger et al. cooled the silicon nanobeam 
to a few hundredths of a kelvin, and verified 
that the vibrational breathing mode was in its 
quantum ground state, to a good approxima- 
tion — the breathing mode did not contain 
even one phonon for about 97% of the time. 
They then fired a long train of pairs of laser 
light pulses (a few tens of millions) at the nano- 
beam, for which the intensity of the individual 
pulses was sufficiently low, and the intervals 
between the pulse pairs were sufficiently long, 
to allow the nanobeam breathing mode to 
relax to its quantum ground state before the 
arrival of the next pulse pair. 

The first pulse in a given pair was blue- 
detuned, enabling the pulse to ‘write’ a phonon 
into the nanobeam breathing mode, whereas 
the second pulse was red-detuned, potentially 
allowing it to ‘read’ a phonon out of the mode. 
The authors tuned the time interval between 
a given pulse pair, from a delay of one-tenth 
of a microsecond to a few microseconds, to 
determine how long the breathing mode could 
store a single phonon for. The scattered light 
produced from the nanobeam was split into 
two, and each half was directed to a different 
single-photon detector; each detector regis- 
tered a voltage pulse only for a scattered pho- 
ton that had written or read a phonon. Having 
two single-photon detectors enabled the detec- 
tion of two scattered ‘write’ or ‘read’ photons 
resulting from the same light pulse. The prob- 
ability of such detection events is extremely 
low because of the inherent weakness of the 
photon force and the low photon-detection 
efficiencies, which is why so many pulse pairs 
were needed. 

Riedinger and colleagues observed that the 
joint probability of detecting a scattered ‘write’ 
photon and a subsequent scattered ‘read’ photon 
significantly exceeds the joint probabilities 
of detecting two scattered ‘write’ photons 
or two scattered ‘read’ photons for the same 
pulse pair. This inequality, together with meas- 
urements of the latter two joint probabilities, 
provides strong evidence that the blue-detuned 
‘write’ pulse puts the experimental system 
into a correlated quantum state, in which the 
number of photons in the cavity is always 
paired with the same number of phonons in 
the mechanical breathing mode. In particu- 
lar, when the cavity is in the vacuum state (no 
scattered ‘write’ photons detected), then the 
mechanical mode must also be in the quantum 
ground state (no phonons). And for the rarer 
situation in which the cavity contains a single 
photon (one scattered ‘write’ photon detected), 
then the mechanical mode must also contain a 
single phonon. This photon–phonon pairing 
is the key result of the experiment. 
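
An inequality of this kind is commonly written as a Cauchy–Schwarz bound on normalized second-order correlation functions. The notation below is an assumption made for illustration, not taken from the text: g_{wr} stands for the cross-correlation between ‘write’ and ‘read’ photons, and g_{ww} and g_{rr} for the two autocorrelations. Classical fields would satisfy

```latex
\[
  \bigl[g^{(2)}_{wr}\bigr]^{2} \;\le\; g^{(2)}_{ww}\, g^{(2)}_{rr},
\]
```

so a measured cross-correlation that exceeds this bound points to non-classical pairing. The exact form used in the Letter may differ from this sketch.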

The authors find that the correlated quantum 
states persisted for up to about 1 μs, well 
short of the time taken for the vibrations of 
the breathing mode to dissipate (a few tens 
of microseconds). This might be because 
the nanobeam heats up during each ‘write’- 
pulse stage. The storage lifetime of phonons 
might be lengthened by reducing the pulse 
intensities. 

Having convincingly demonstrated a 
photon–phonon interface, a striking next step 
would be to generate an entangled quantum 
state that involves a single, breathing-mode 
phonon at the micrometre-wavelength scale 
on two silicon nanobeams separated from 
each other by up to 1 m or more. This could be 
achieved by bringing together ‘write’ photons 
scattered from both nanobeams and allowing 
them to interfere before being detected. 
While the entangled state survives (possibly for 
up to a few microseconds), it would be mean- 
ingless to ascribe the phonon to one of the 
nanobeams and the vibrational ground state 
to the other. All that could be said is that the 
two nanobeams collectively possess the single 
phonon of vibrational energy — a bizarre state 
of affairs indeed. Such an entangled state has 
previously been demonstrated, but with much 
shorter-lived, higher-frequency (tens of terahertz) 
phonons. 

The ability to couple gigahertz and lower- 
frequency mechanical quantum-vibrational 
motion to other quantum systems (consisting 
of, for example, a few atoms, electrons or 
microwave photons) would allow nano- 
mechanical resonators to serve as versatile 
interfaces that facilitate the transfer of quantum 
states between light and these other systems. 
Together with the ability of light to transmit 
quantum states over large distances, this 
would enable entanglement to be distributed 
between widely separated quantum systems 
— which would be useful for quantum 
information-processing applications. 
gods in human civilization 


Cross-cultural experiments find that belief in moralistic, knowledgeable and 
punishing gods promotes cooperation with strangers, supporting a role for 
religion in the expansion of human societies. SEE LETTER P.327 


DOMINIC D. P. JOHNSON 


In the modern world, we rely on govern-
ments, courts and the police to deter and
punish those who would otherwise under-
mine social cooperation. But how did human
societies achieve and sustain cooperation
before these institutions existed? One possi-
bility is religion: under the watchful gaze of
supernatural agents, people modify their
behaviour in an effort to avoid the wrath of the
gods. In this issue, Purzycki et al. (page 327)
report a cross-cultural field-study finding
that people are consistently more willing to 
give money to strangers of the same religion if 
the donor believes in a god that is moralizing 
(concerned about good and bad behaviour), 
knowledgeable (aware of one’s thoughts and 
actions) and punishing (able to exact harm). 
Pioneering anthropologists, such as Émile
Durkheim and Bronisław Malinowski in the
early twentieth century, have long argued that 
supernatural beliefs offer a powerful way to 
build materially cooperative societies. But in 


Figure 1 | Weighing of the heart. This papyrus manuscript, a detail from the ancient Egyptian
‘Book of the Dead’ called Papyrus of Ani, depicts a scene in which the dead Ani’s heart is weighed against
a feather, representing Maat, goddess of truth and justice. At the top of the scene are the great Egyptian 
gods, ready to pronounce judgment on whether Ani should be granted entrance to the afterlife or 
banished to the underworld. 


18 FEBRUARY 2016 | VOL 530 | NATURE | 285 


© 2016 Macmillan Publishers Limited. All rights reserved 


| RESEARCH | NEWS & VIEWS 


50 Years Ago 


“The future of nuclear power’ — The 
advanced type of gas-cooled reactor 
was expected in 1964 to cost less 
than a coal station of the same date, 
and within a year of the delivery of 
this Lecture the firm tender prices 
for the Dungeness B nuclear station 
showed that this was, in fact, the 
case, provided the station is built
at the tender price. In the United
States similar dramatic falls in
costs have been experienced with
their water moderated reactors; 
and Canada’s heavy-water reactor 
is expected to have very low fuel 
costs, although it will have a high 
capital cost. These types of reactors, 
by the end of the century, would be 
using 100,000 tons of uranium per 
annum, on reasonable assumptions 
as to the rate of development of 
nuclear stations. 

From Nature 19 February 1966 


100 Years Ago 


The memorandum regarding
the neglect of science to which
you refer in your leading article
last week fails in my judgment
by its moderation. The proposal
that at least as many marks in
the Civil Service examinations
shall be allotted to science as to
classics, may be a step in the right
direction, but it is a halting one...
The revelations that have come to
light in the course of this bloody
war will, we hope, do at least this
good, that the people may be
induced to appreciate the necessity
of basing education upon natural
science instead of upon the classics.
The appointment of a Minister
of Science which is advocated in
the memorandum would under
existing conditions be of little use.
Whatever qualifications he might
be selected for, we may safely
prophesy that entire ignorance
of the subject he is to administer
would be one.

From Nature 17 February 1916 


the thriving new field of evolutionary religious 
studies, researchers are drawing on evolution- 
ary theory to explore how religious beliefs can 
bring adaptive advantages — that is, contri- 
bute to an individual's survival or reproductive 
success. Although major debates remain, one
theory that has gathered momentum is that a
belief in supernatural punishment for violating
social norms may be adaptive (Fig. 1).

How could this idea apply to cooperation? 
Deterring oneself from the pursuit of self- 
interest because of the risk of punishment 
from a watchful supernatural eye would seem 
to reduce an individual’s evolutionary fitness, 
and should thus be eliminated by natural selec- 
tion. However, even if such beliefs are false and 
costly, they may have generated net benefits: 
to individuals, by steering them away from 
selfish behaviour that risked retaliation in 
increasingly transparent and gossiping human 
societies; and/or to groups, by increasing 
the performance of the group as a whole in 
competition with other groups.

But what evidence do we have for such a 
theory? Empirical evidence that supernatural 
beliefs promote cooperation is mounting, 
but has tended to rely on qualitative, soci- 
ety-level or proxy measures of beliefs. Study 
participants have also typically been university 
students in developed nations, thus omitting 
the small-scale societies most relevant to the 
evolutionary problem at hand: how human 
groups achieved cooperation and made the 
transition from small to large societies in the 
first place. Perhaps the most important lacuna 
is that previous studies have not rigorously 
addressed whether the beliefs of the recipients
of cooperative acts change people’s generosity
towards them. 

Purzycki and colleagues’ study addresses 
many of these issues by using controlled exper- 
imental games among participants from eight 
small-scale societies around the world and 
tying the results to explicit measures of indi- 
viduals’ beliefs. Participants played a simple 
but clever game (designed to subtly reveal 
preferences), in which they allocated coins 
between a distant co-religionist (people who 
were members of the same religion, but who 
lived geographically far away) and either them- 
selves or a local co-religionist. The researchers 
found that the more subjects rated their god 
as moralistic, knowledgeable and punishing, 
the more money they gave to distant strangers 
adhering to the same religion. Notably, belief 
in rewards from the god could not account 
for the results — supernatural punishment 
seemed responsible. 

Because the study is correlational, one 
worry is that some unexamined variable could 
account for the results — perhaps certain 
people are disposed to both kindness to 
strangers and belief in punitive gods, for 
example. However, Purzycki et al. show that 
allocations increased for moralistic gods that 
were punishing and knowledgeable, but not
for more locally relevant supernatural agents
that were also punishing and knowledgeable. 
Hence, general conceptions of supernatural 
agents cannot alone explain the results. Rather, 
it is moralistic, ‘big’ gods that seem to stimulate
generosity towards distant co-religionists.

The authors did not conduct experiments 
to assess allocations to oneself versus a local 
co-religionist, nor experiments involving non- 
religious recipients, so we don’t know whether 
local supernatural agents might promote 
cooperation between individuals within the 
local community, as other work has found, or
whether any kind of god promotes coopera- 
tion with strangers of another, or no, religion. 
Purzycki et al. focused on cooperation with 
co-religionists beyond the local community, 
and thus the expansion of human society from 
small to large groups. But future studies of the 
role of local gods are needed to improve our 
understanding of the evolutionary origins of 
religion (before there were big groups or big 
gods), and of whether and how religion brings 
adaptive advantages to individuals.

It is worth emphasizing that the subjects 
in this experiment were not cooperative with 
random strangers, only with strangers that 
shared the same god. We therefore still face 
the challenge of understanding the promotion 
of cooperation and trust among members of 
different religions. Purzycki and colleagues’ 
finding that sharing the same god is key to 
cooperation suggests that this may be an even 
harder nut to crack. In fact, one of the most 
compelling explanations for why individu- 
als may help the group at their own expense 
is that it aids survival in an environment of 
inter-group competition. Whenever the threat 
of exploitation or warfare is present, the best 
protection is larger and more-cohesive soci- 
eties, which are better able to deter or defeat 
rivals. Religion’s positive role in reducing 
self-interest and promoting cooperation may 
therefore reflect the costs of competition as 
much as the benefits of generosity.

Religion is arguably the most powerful 
mechanism that societies have found to bind 
people together in common purpose. From 
ancient civilizations, to the spread of Christi- 
anity, to today’s Islamist terrorist groups, reli- 
gion has motivated not only the subordination 
of self-interest for the wider group, but even 
martyrdom in the name of a god. We are still 
grappling to understand, from a scientific per- 
spective, why and under what circumstances 
humans sacrifice their own welfare for the 
benefit of distant others. But there is little
doubt about the power of religion to promote 
allegiance to one’s god and group. Purzycki 
and colleagues’ study offers the most explicit 
evidence yet that belief in supernatural pun- 
ishment has been instrumental in boosting 
cooperation in human societies. A large part 
of the success of human civilizations may have 
lain in the hands of the gods, whether or not 
they are real.


Ice streams waned
as ice sheets shrank 


It emerges that ice discharge from a major ice sheet did not increase rapidly at the 
end of the most recent ice age. The finding points to steady, not catastrophic, 
ice-sheet loss and sea-level rise on millennial timescales. SEE LETTER P.322 


JASON P. BRINER 


Ice sheets and the global sea are locked in
a tug of war for water in which climate
change dictates which side gains or loses
ground. Mindful that tug-of-war contests often
end with the catastrophic collapse of one side,
climate scientists are deeply concerned about
the manner in which ice sheets are currently
declining. This concern is largely focused on
mechanisms that amplify ice loss, such as the
acceleration of ice streams, massive rivers
of ice that drain a disproportionately large
amount of ice from ice sheets (Fig. 1). On
page 322 of this issue, Stokes et al. present the
first complete reconstruction of ice-stream
activity throughout the disintegration of an
ice sheet.

A fleet of geoscientists scattered across the
poles, along with satellites in space, watches
every move that ice streams make in Green-
land and the Antarctic. This work matters
because ice streams are the valves that control
the enormous volumes of ice poised to spill
into the oceans and cause a rapid jump in sea
level. But the duration of these observations
(tens of years) has been short compared with
the time that it takes for substantial ice-sheet
changes to occur (hundreds to thousands of
years).

One way to gauge ice-stream activity on
longer timescales is to use the geological
record, in sediments and landforms, of
past ice streams that left their imprint on
formerly glaciated landscapes. Over the past
two decades, Stokes and his co-workers have
greatly advanced the methods used to assem-
ble scraps of evidence left behind by extinct ice
streams and to generate histories of ice-stream
activity. The present study builds on this foun-
dation and on the established chronology of
ice-sheet positions through time.


Stokes et al. evaluate the importance of ice 
streams at the end of the most recent ice age, 
when a tug of war also played out between ice 
sheets and the ocean. Their evidence shows 
that ice streams turned on and off, and shifted
from place to place, during the disappearance
of the Laurentide Ice Sheet — the Antarctic- 
sized ice sheet that occupied Canada and the 
northern United States at that time. Perhaps 
most notably, Stokes and colleagues find that 
ice-stream activity decreased as the planet 
warmed: the number of ice streams fell, the 
amount of ice expunged by them decreased 
and ice streams occupied a progressively 
smaller percentage of the ice-sheet edge. 

The authors’ findings represent a leap for- 
ward in our view of ice-stream activity on 
timescales longer than a few decades. Until 
now, we were in the dark about how ice streams
respond to ice-sheet decay. Would excessive 
ice streaming lower the elevation of ice sheets, 
thus robbing ice-accumulation centres of their 
elevated positions (which are good for gath- 
ering snow that compresses to form ice), and 


Figure 1 | An Antarctic ice stream. Fast-moving rivers of ice, such as the Byrd Glacier (pictured), are
known as ice streams, and drain a disproportionately large amount of ice from ice sheets. Stokes et al.
report a complete reconstruction of ice-stream activity throughout the disintegration of the ancient
Laurentide Ice Sheet.

triggering catastrophic ice-sheet collapse?
Does the additional meltwater that forms dur- 
ing periods of warming lubricate ice streams, 
causing them to discharge ice faster for pro- 
longed periods? Stokes and co-workers’ results 
suggest not. 

However, the relevance of these findings to 
future ice-sheet behaviour is not totally clear, 
because the Laurentide Ice Sheet is not an 
exact analogue of today’s ice sheets. For exam- 
ple, much of the Laurentide (including the ice 
streams embedded in it) terminated on land, 
whereas ice streams within the Greenland and 
Antarctic ice sheets terminate in the sea. Fur- 
thermore, present-day ice streams are largely 
‘fixed’ in space by the mountains through
which they pass, and therefore could flow for
thousands of years. This differs from the many 
Laurentide ice streams that were not confined 
by the underlying landscape, and thus were 
typically more ephemeral. 

Predicting the pace of future ice loss and sea- 
level rise is an enormous challenge. With their 
improved sophistication, numerical ice-sheet 
models have led to great strides in our under- 
standing of how quickly ice sheets may vanish 
in the future, particularly marine-based ice 
sheets such as the West Antarctic Ice Sheet.
Stokes and co-workers’ strategy of extracting 
data from relict ice-age landscapes provides a 
new viewpoint. As the Greenland Ice Sheet, 
and particularly the East Antarctic Ice Sheet, 
eventually retreat from the ocean, the Lauren- 
tide Ice Sheet becomes more closely analogous 
to them. Continued effort is needed to discover 
other secrets hidden in the palaeo-record of 
ice-sheet response to past climate change. 

The current findings may not provide 
guidance about ice-stream changes during 
this century, but they will help us to predict the 
pace of long-term ice-sheet disappearance. And 
although ice sheets will inevitably lose the cur- 
rent tug of war, it is reassuring to know that the 
competition might not end catastrophically.


CELL BIOLOGY


Form follows function 
for mitochondria 


The fission of organelles called mitochondria has now been linked to the 
stress-sensor protein AMPK. When activated by stress, this protein phosphorylates 
the mitochondrial receptor protein MFF, which recruits the fission machinery. 


CHUNXIN WANG & RICHARD YOULE 


The architect Louis Sullivan wrote that,
for all things, “form ever follows func-
tion”. In keeping with this rule, energy-
producing organelles called mitochondria
exist in different forms in different cell types
and under different conditions. Long, con-
tiguous assemblies of mitochondria promote
energy production, whereas stress causes their
fragmentation into small, round, disconnected
units. Mitochondria switch between these two
forms through rounds of fission and fusion.
Writing in Science, Toyama et al. identify a sig-
nalling pathway that triggers the fission of long
mitochondrial assemblies in response to stress. 

The energy status of a cell can be measured 
by the ratio of AMP molecules (an end product 
of chemical-energy expenditure) to energy- 
carrying ATP molecules in the cytoplasm. 
Mitochondria produce ATP in response to 
energy depletion. One sensor of energy deple- 
tion is an evolutionarily conserved protein 
known as AMP-activated protein kinase 
(AMPK). This protein is activated by bind- 
ing AMP or by stresses that deplete ATP and 
increase the AMP:ATP ratio, such as glucose 
deprivation, inadequate blood supply and lack 
of oxygen.

Activation of AMPK enhances the pro- 
duction of mitochondria and improves 
endurance — indeed, the use of drugs that 
activate AMPK, such as AICAR, is banned 
from competitive sports by the World 
Anti-Doping Agency. AMPK phosphoryl-
ates several substrates, thereby regulating
metabolism, mitochondrial proliferation and 
an intracellular degradation process called 
autophagy. However, previous studies have 
not demonstrated whether AMPK directly 
regulates mitochondrial form. 

Researchers have known for years that
mitochondria undergo fission and become 
fragmented in response to poisons such as cya- 
nide, which damage the organelles by inhib- 
iting the electron transport chain (ETC) that 
drives ATP synthesis. But how such mitochon- 
drial dysfunction is sensed, and how it triggers 
a rapid fission response, has remained unclear. 

Toyama et al. found that AMPK is rapidly 
activated in response to the ETC-inhibiting 
drugs rotenone and antimycin A, presumably
as a consequence of an increased AMP:ATP
ratio. Mitochondria became fragmented in 
drug-treated cells, but this response was pre- 
vented by deletion of the gene that encodes 
AMPK. By contrast, pharmacological acti- 
vation of AMPK with AICAR was sufficient 



Figure 1 | The stress of separation. Mitochondria, 
which are the cell’s energy centres, sometimes exist in 
fused assemblies that can be fragmented in response 
to stress. The receptor protein MFF is located on 
the outer membrane of such assemblies, at putative 
separation sites. Toyama et al. report that, when
activated in response to stress, the enzyme AMP-
activated protein kinase (AMPK) phosphorylates
(P) MFF, which then recruits the protein Dynamin-
related protein 1 (DRP1) to the membrane. DRP1
forms constricting spiral complexes around the 
mitochondria, mediating fission. 


to cause fragmentation, demonstrating a 
surprising link between activation of this 
sensor protein and mitochondrial fission. 

How does AMPK relay signals to the mito- 
chondrial fission machinery? The core 
mitochondrial fission factor is the enzyme 
Dynamin-related protein 1 (DRP1), which 
wraps around constriction sites at the mito- 
chondrial membrane, forming spiral com- 
plexes that mediate scission. DRP1 is primarily 
located in the cytoplasm, but is recruited
to prospective mitochondrial scission sites 
by receptor proteins such as MFF (ref. 10). 
Toyama and colleagues showed that AMPK 
directly phosphorylated MFF at two amino- 
acid residues, serine-155 (S155) and S172, 
consistent with results from a previous study.
This modification increased the recruitment of 
DRP1 to mitochondria (Fig. 1). Furthermore, 
mutations that blocked MFF phosphorylation 
prevented poison-mediated mitochondrial 
fragmentation and DRP1 recruitment. 

Thus, the authors have identified a mecha- 
nism that regulates mitochondrial form, and 
probably function. They propose that the 
increased fission induced by AMPK facilitates 
selective elimination of the damaged mito- 
chondria through autophagy, which may act 
as a quality-control response to the poisons. In 
support of this suggestion, mitochondrial fis- 
sion has previously been linked to engulfment 
and elimination of damaged mitochondria 
by autophagosome structures, which mediate 
autophagy. Therefore, the authors’ proposal
is a plausible explanation for their observa- 
tions, and deserves further study. 

Although glucose deprivation or starva- 
tion can also increase the AMP:ATP ratio and 
activate AMPK, starvation does not lead to fis- 
sion, but actually inhibits the process, causing 
mitochondria to elongate. How different
stimuli that activate AMPK can lead to distinct 
mitochondrial responses will be an interesting 
avenue for exploration. Perhaps, for instance, 
differential modulation of DRP1 phosphoryla- 
tion controls these varied responses. There is 
evidence that starvation of cells causes
DRP1 phosphorylation at S637. It has also
been reported that AMPK activation indi-
rectly phosphorylates DRP1 at S637, and this
phosphorylation has been linked to the inhibi-
tion of DRP1 and a decrease in mitochondrial
fission.

Although it is not clear how the activation 
of AMPK by different energetic stressors yields 
opposing effects on mitochondrial morphol- 
ogy, it seems apparent that damage is distin- 
guished from energetic demand, leading to 
fission and quality control, rather than fusion 
and energy production. The pathway identified 
by Toyama et al. highlights the importance of 
fine-tuning mitochondrial-fission responses to 
different stimuli. Because form follows func- 
tion, the authors’ study implies that AMPK 
works in the context of other cellular signals to 
promote a variety of mitochondrial functions.


Females have


a lot of guts 


The discovery of sex-biased proliferation in the intestinal stem cells of fruit-fly 
midguts reveals that the organ’s size is determined by a previously undefined, 
sex-specific molecular pathway. SEE LETTER P.344 


JUSTIN FEAR & BRIAN OLIVER 


Within-species genetic differences
are central to many biological phe-
nomena, from evolution to disease
susceptibility. One of the most under-studied
intraspecies differences is sex. We tend to think 
of sex in terms of characteristics that are related 
to reproduction, but differences between sexes 
extend to many parts of the body. For instance, 
the midgut (an absorptive organ similar to 
the small intestine) of the fruit fly Drosophila 
melanogaster is longer in females than in 
males — especially after mating, when females 
produce many eggs that are replete with pro-
teins and lipids. On page 344 of this issue,
Hudry et al. report that a previously unidenti-
fied branch of the sex-determination pathway
underlies this dynamic difference in organ size, 
by controlling the proliferation of stem cells in 
the midgut. 

The classic view of sex determination in 
fruit flies involves a regulatory cascade in 
which genes are spliced into different forms 
in a sex-specific manner. These early steps in
fly sex determination differ from those in the 
mammalian set-up, but the ultimate outcomes 
are similar. In flies with two X chromosomes, 
the protein Sex lethal (Sxl) splices an RNA 
called transformer (tra) into a protein-coding 
isoform. The TRA protein, acting with its 
cofactor TRA2, binds to both the RNA pro- 
duced from the doublesex (dsx) gene, splicing it 
into a female-specific isoform, and to the RNA 
of fruitless (fru), blocking production of a male-
specific isoform. In XY flies, the gene Sxl is not
expressed. As such, dsx and fru are spliced into
male-specific isoforms by default. Through
another pathway, Sxl prevents dosage com- 
pensation in females (in males, transcription 
of genes on the X chromosome is upregulated, 
to compensate for the fact that females have 
two copies of X). 

Hudry et al. investigate sex-specific dif- 
ferences in gene expression in the fruit-fly 
midgut. They report that genes involved in 
cell division are preferentially expressed in 
females. Furthermore, they find that Sxl acts to 
enhance the proliferative capacity of intestinal 
stem cells in the female midgut. These differ- 
ences from the male midgut help to explain 
the larger, more-plastic guts found in females. 

The authors go to great lengths to demon- 
strate that these sex-specific differences are 
not regulated by the classic splicing pathway. 
They rule out a role for dsx and fru in female 
gut growth and plasticity and provide evi- 
dence that TRA2 might also be dispensable. 
However, there is a caveat to this sugges- 
tion — although mutation of tra2 had no effect 
on proliferation, there was still TRA2 activity 
in these flies. By contrast, tra expression is 
required for proliferation to be enhanced in 
females. Thus, sex differences in the fruit-fly 
midgut are regulated by a previously unidenti- 
fied branch of the sex-determination pathway, 
one that is downstream of Sxl and TRA. 

Plasticity in the female gut is blocked by 
downregulating Sxl, but Hudry and col- 
leagues show that plasticity can be rescued 
by the expression of TRA. These data make a 
convincing case that sex differences in the gut 
are mediated by tra rather than by dosage com-
pensation. Moreover, misexpression of tra in

Figure 1 | Sex and the gut. The midgut of female fruit flies (Drosophila melanogaster) is longer than
that of males. It contains intestinal stem cells (dark green), which give rise to intestinal progenitors
(light green), which then differentiate into the mature cell types of the intestine (pink and orange). Hudry
et al. report that proliferation of intestinal stem cells is enhanced in female midguts compared with that
in males. They find that the protein TRA, which is produced only in females, promotes this sex-specific
proliferation by modulating expression of the genes reduced ocelli (rdo), Imaginal disc growth factor 1
(Idgf1), Serpin 88Eb (Spn88Eb) and perhaps others.


intestinal progenitors enhances proliferation 
in the male midgut. Thus, unlike differentiated 
sex-specific organs such as the male accessory 
gland (the equivalent of the human prostate), 
the sex effects on intestinal stem cells are 
fully reversible. 

Next, the authors look for direct and indirect 
targets of TRA in intestinal stem cells by 
expression profiling. They find 72 genes for 
which RNA splicing or steady-state expres- 
sion levels are modulated by TRA, three of 
which encode proteins that regulate proli- 
feration of the cells (Fig. 1). TRA is thought 
of as an RNA-binding protein and splicing 
factor, so direct targets would be expected to
be regulated post-transcriptionally. However, 
in these three targets, TRA modulates tran- 
script abundance. It is possible that these genes 
are indirect targets of TRA and are regulated 
by an unidentified downstream transcription 
factor. Alternatively, TRA might have a direct 
effect on transcript stability, or even on the 
regulation of transcription. 

Cell-autonomous events (those that occur 
on a cell-by-cell basis, rather than affect- 
ing neighbouring cells) have received much 
attention in sex-determination research in 
D. melanogaster, whereas vertebrate work has 
focused mainly on the influence of hormones 
such as testosterone. However, coordinating 
sex-biased expression between cells and organs 
in fruit flies clearly requires broader control. 
Indeed, although tra is required only in the 
intestinal stem cells in which it is expressed, 
two of its three proliferation-regulating tar- 
gets — Imaginal disc growth factor 1 (Idgf1) 
and Serpin 88Eb (Spn88Eb) — encode secreted 
proteins. Idgf proteins are known to increase 
proliferation and might bind to receptor
proteins on the intestinal stem cells from
which they are secreted to promote prolifera-
tion directly. Serpin proteins can also bind to a
variety of protein types, including hormones,
which they may escort around the body.

In addition to having longer guts, female 
fruit flies are larger than males. Recent work
shows that fat-cell expression of tra, but not of 
dsx or fru, contributes autonomously to cell size 
and systemically to female body size, by pro- 
moting insulin secretion from the brain. Idgf1 
and Spn88Eb are highly expressed in fat cells;
could they be TRA targets there, too? Of note, 
the protein Serpinb1 mediates plasticity in 
insulin-producing cells in mice, suggesting
that this pathway might be evolutionarily con- 
served. There is still much to learn, but the sex- 
biased sizing of organisms and organs seems 
likely to be under the control of an entirely
unknown branch of the sex-determination
cascade, with both cell-autonomous and non- 
cell-autonomous functions. 

In a final series of experiments, Hudry 
and colleagues demonstrate that sex-specific 
organ plasticity alters susceptibility to disease, 
because female flies are more sensitive than 
males to tumours that are induced through 
genetic changes. Little is known about sex- 
biased disease susceptibility in D. melano- 
gaster, although sex-specific interactions are 
implicated in a host of disease-related traits,
suggesting that sex is an important considera- 
tion for understanding disease in flies. Sex bias 
is seen in many human diseases, including
several digestive-tract cancers. Furthermore, 
although the authors show substantial sex- 
biased gene expression in the fruit-fly gut, 
many of the studies on digestive-tract tumour 
development in D. melanogaster looked only 
at females. Thus, the current study highlights 
how influences on disease can be missed by 
ignoring sex chromosomes and sexual identity, 
even in non-mammalian model organisms.


From sea to sea


The genome sequence of the marine flowering plant eelgrass (Zostera marina)
sheds light on how marine algae evolved into land plants before moving
back to the sea. SEE LETTER P.331


SUSAN L. WILLIAMS 


Eelgrass (Zostera marina) is a member
of a family of highly specialized, sexu-
ally reproducing, marine flowering
plants known collectively as seagrasses. It is
an unlikely model for plant evolution, but is
a useful one because it has undergone major
habitat shifts: it evolved from marine algae 
into a terrestrial flowering plant, then moved 
back to the sea again. In this issue, Olsen et al.
(page 331) describe the complete genome 
sequence of eelgrass. Their study marks the 
culmination of 8 years’ work by 35 scientists 


STEVE TREWHELLA/FLPA 


from around the world, and should help 
plant biologists not only to dissect how 
eelgrass evolved, but also to gain a bet- 
ter understanding of flowering-plant 
evolution in general. 

Seagrasses have been largely ignored 
by evolutionary biologists, perhaps 
because they are so often out of sight. 
But eelgrass can form vast green 
coastal-sea meadows that support many 
species (Fig. 1), including ecologically 
and economically valuable organisms 
such as halibut, clams and endangered 
sea otters. As such, it is a good model 
system in which to analyse how genetics 
plays into ecosystem functioning — an 
understanding of which is the linchpin 
for human efforts to conserve biological 
diversity in the face of a rapidly chang- 
ing Earth. Moreover, eelgrass is as pro- 
ductive in terms of biomass as are maize 
(corn) and sugar cane, and its root mat 
stabilizes sediments and thus shore- 
lines. Indigenous peoples from the US 
Pacific Northwest to Mexico have built 
cultures around eelgrass, and Europe- 
ans have used it to stuff furniture and to 
insulate homes, and have even grazed 
cattle on intertidal meadows.

Eelgrass is unusual from an evo- 
lutionary standpoint, too. The plant 
completed a remarkable feat when 
it readapted from a freshwater to a 
marine lifestyle and became able to 
compete with seaweeds. Its evolutionary 
path began with marine green algae, which 
evolved to cope with terrestrial habitats and 
to produce flowers and seeds. Then, angio- 
sperms (the collective name for flowering 
plants) entered fresh water — an evolution- 
ary step that seems to have been made many 
times. From the freshwater angiosperm line- 
age, eelgrass, along with a few other seagrass 
species, diverged to return to the sea. But 
it seems that this step occurred just three 
times’, indicating the extreme nature of this 
habitat shift. 

Despite the plant’s evolutionary and ecological 
importance, genetic analysis of eelgrass did 
not begin until the 1990s (ref. 4), more recently 
than for other angiosperms such as wheat, peas 
and land-dwelling grasses. The early studies 
were fraught with setbacks, because it proved 
difficult to purify the DNA and protein vari- 
ants found in different plants of the species. 
Furthermore, scientists have never success- 
fully artificially selected or genetically engi- 
neered eelgrass. Thus, the classic approach of 
crossing plants cannot be used to delve into 
seagrass genetics. Given this problematic 
history, Olsen and colleagues’ sequence repre- 
sents a major advance. 

The complete genome sequence reveals that, 
in moving from calm lakes and ponds to the 
rough, salty ocean, eelgrass lost several key 
gene groups. It lost all of the genes involved 


Figure 1 | Eelgrass ecosystems. The eelgrass Zostera marina is 
home to an abundance of wildlife, including sea anemones. 


in stomata (pores on plant leaves that regulate 
gas exchange and minimize water loss). These 
pores are not essential in seagrasses, because 
the plants are not prone to moisture loss, and 
they absorb dissolved gases directly through 
outer cell layers. The organism also lost genes 
that confer protection from ultraviolet light, 
as well as those involved in sensing far-red 
light — these wavelengths do not penetrate 
very far in coastal waters. 

During the move to the sea, eelgrass regained 
genes encoding cell-wall compounds that 
were lost when marine algae transitioned to 
land. These genes have crucial roles in allow- 
ing osmotic adjustment to salt, and in pro- 
moting nutrient uptake and gas exchange in 
saltwater environments. Other evolution- 
ary innovations include changes that enable 
pollen to stick to stigmas (the tips of female 
flower parts) in salt water, an expanded capac- 
ity to capture light and to photosynthesize in 
dim coastal seas, and a loss of genes encoding 
the proteins that synthesize and sense volatile 
terpenes, compounds that are common in aro- 
matic herbs but that would not be effective in 
an aqueous environment in their putative role 
of deterring predators in the ocean. 

The eelgrass genome is valuable for sev- 
eral reasons. For evolutionary biologists, 
it represents a missing piece in the puzzle 
of angiosperm evolution. It also provides a 
wealth of information that will improve our 
understanding of diverse biochemical 
pathways. For example, identification of 
the DNA sequences of genes that confer 
tolerance to salt water means that the 
plant could provide a model system in 
which to study how agricultural crops 
might adjust to increasingly saline soils. 

Eelgrass is remarkably adaptable, 
growing under ice in the Arctic Ocean 
and surviving scorching heat in the 
Mexican state of Baja California. It also 
has the largest distribution of any plant 
in the temperate Northern Hemisphere. 
For marine ecologists, the genome is a 
powerful tool for uncovering the adap- 
tations that allow the plant to thrive in 
a wide range of environmental condi- 
tions. This ability to adapt might be the 
key to surviving environmental changes 
such as ocean acidification, warming 
and freshening that are occurring under 
global climate change. It is already 
known that eelgrass populations that 
are more genetically diverse survive 
disturbances better, can be restored 
more rapidly, produce more biomass 
and support more animals than their 
less-diverse counterparts’. Now, the 
genome will help researchers to delve 
into exactly which genetic elements 
facilitate such high biomass production 
and resilience. 

Olsen and colleagues’ genome- 
sequencing feat may have come 
just in time. Seagrass ecosystems are being 
lost rapidly, with seagrass fields disappear- 
ing at a global rate of about one Ameri- 
can-football field every 30 minutes; some 
species and their associated animals are 
even threatened by extinction’. The dis- 
appearances are attributable to human 
disturbances such as building marinas, 
over-fertilizing coastal waters and aquacul- 
ture’. Seagrass restoration and conservation 
efforts are under way across the globe, and an 
understanding of the genes that adapt these 
fascinating species to marine life can only help 
these efforts. 
The peptidergic control circuit for sighing 


Peng Li¹*, Wiktor A. Janczewski²*, Kevin Yackle¹*, Kaiwen Kam²†, Silvia Pagliardini²†, Mark A. Krasnow¹ & Jack L. Feldman² 


Sighs are long, deep breaths expressing sadness, relief or exhaustion. Sighs also occur spontaneously every few minutes to 
reinflate alveoli, and sighing increases under hypoxia, stress, and certain psychiatric conditions. Here we use molecular, 
genetic, and pharmacologic approaches to identify a peptidergic sigh control circuit in murine brain. Small neural 
subpopulations in a key breathing control centre, the retrotrapezoid nucleus/parafacial respiratory group (RTN/pFRG), 
express bombesin-like neuropeptide genes neuromedin B (Nmb) or gastrin-releasing peptide (Grp). These project to 
the preBötzinger Complex (preBötC), the respiratory rhythm generator, which expresses NMB and GRP receptors in 
overlapping subsets of ~200 neurons. Introducing either neuropeptide into preBötC or onto preBötC slices induced 
sighing or in vitro sigh activity, whereas elimination or inhibition of either receptor reduced basal sighing, and inhibition 
of both abolished it. Ablating receptor-expressing neurons eliminated basal and hypoxia-induced sighing, but left 
breathing otherwise intact initially. We propose that these overlapping peptidergic pathways comprise the core of a 
sigh control circuit that integrates physiological and perhaps emotional input to transform normal breaths into sighs. 


A sigh is a long, deep breath often associated with sadness, yearning, 
exhaustion or relief. Sighs also occur spontaneously, from several per 
hour in humans to dozens per hour in rodents!”. Their recurrence 
during normal breathing enhances gas exchange and may preserve 
lung integrity by reinflating collapsed alveoli? *. Sighing increases in 
response to emotional and physiological stresses, including hypoxia 
and hypercapnia, and in anxiety disorders and other psychiatric 
conditions where it can become debilitating”. 

The core of the breathing rhythm generator is the preBotC, a clus- 
ter of several thousand neurons in ventrolateral medulla. preBotC is 
required for inspiration and generates respiratory rhythms in explanted 
brain slices’~'°. Each rhythmic burst activates premotoneurons and 
motoneurons that contract the diaphragm and other inspiratory mus- 
cles, generating a normal (‘eupneic’) breath!°. Occasionally, a second 
preBotC burst immediately follows the first, and this ‘double burst’ 
leads to the augmented inspiration of a sigh!!~", typically about twice 
the volume of a normal breath". Thus, the command for normal 
breaths and sighs both appear to emanate from preBotC. 

A variety of neuromodulators and neuropeptides!>’*, including frog 
bombesin’», can influence sighing in rodents. However, the endoge- 
nous sigh control pathways have not been identified. We tested the 
effect of injecting bombesin into preBotC””, and we screened for genes 
selectively expressed in breathing control centres (K.Y. and M.A.K., 
in preparation). These parallel approaches led to identification of two 
bombesin-like neuropeptide pathways connecting the RTN/pFRG, 
another medullary breathing control centre'*!, to preBotC. We pro- 
vide genetic, pharmacologic and neural ablation evidence that these 
pathways are critical endogenous regulators of sighing, and define the 
core of a dedicated sigh control circuit. 


Neuromedin B links two breathing control centres 

To identify breathing control genes, we screened >19,000 gene expres- 
sion patterns in embryonic day 14.5 mouse hindbrain” (K.Y. and 
M.A.K., in preparation). The most specific pattern was Nmb, one of 
two genes encoding bombesin-like neuropeptides in mammals. Nmb 


is expressed in the medulla surrounding the lateral half of the facial 
nucleus, in or near RTN/pFRG in mouse (Fig. 1a, b and Extended Data 
Fig. 1a) and rat (Extended Data Fig. 1b). Nmb mRNA was also detected 
in the olfactory bulb and hippocampus (Extended Data Fig. 1c, d). 

Nmb expression was further characterized using an Nmb-GFP 
BAC transgene with the Nmb promoter driving GFP expression, 
which reproduced the endogenous Nmb pattern (Fig. 1b, c). Nmb- 
GFP was expressed in 206 ± 21 (mean ± standard deviation (s.d.), n = 4) 
RTN/pFRG neurons per side, most of which (92%, n = 53 cells scored) 
co-expressed Nmb mRNA (Extended Data Fig. 1e–h). In CLARITY- 
processed”! brainstems, GFP-labelled cells surrounded the lateral half of 
the facial nucleus, with the highest density ventral and dorsal (Fig. 1d, e 
and Supplementary Video 1). This ventral parafacial region is the 
RTN, an important sensory integration centre for breathing!*???, 
Nearly all Nmb-GFP-positive cells (96%; n = 202 cells from 2 animals) 
co-expressed canonical RTN marker PHOX2B”* (Fig. 1f), comprising 
one-quarter of the ~800 PHOX2B-positive RTN neurons”. 

Nmb-expressing neurons projected to preBötC (Fig. 1g, j). Punctate 
NMB staining was detected along the projections (Extended Data 
Fig. 2), with some puncta abutting somatostatin (SST)-positive preBötC 
neurons (Fig. 1h and Extended Data Fig. 2). Approximately 90 preBötC 
neurons expressed Nmbr, the G-protein-coupled receptor specific for 
NMB (Fig. 1i, see later). Thus, Nmb-expressing RTN/pFRG neurons 
may directly modulate preBötC neurons. 


NMB injection into preBotC induces sighing 

To investigate the function of this NMB pathway, the peptide was microinjected 
into preBötC of urethane-anaesthetized adult rats. Before injection, 
and after control (saline) injections (data not shown), airflow and 
diaphragmatic activity (DIAEMG) showed the normal (eupneic) breathing 
pattern, with diaphragm activity bursts during inspiration (Fig. 2a, b 
and Extended Data Fig. 3a). Every minute or two, we observed a sigh 
(44 ± 10 h⁻¹, n = 24; unaffected by saline injection, Extended Data 
Fig. 3b), a biphasic double-sized breath coincident with a biphasic 
DIAEMG event (Fig. 2b and Extended Data Fig. 3a, c–f). Amplitude 


¹Department of Biochemistry and Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, California 94305, USA. ²Systems Neurobiology Laboratory, Department of 
Neurobiology, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California 90095, USA. †Present addresses: Department of Cell Biology and Anatomy, Chicago 
Figure 1 | NMB neuropeptide pathway neurons in breathing centre. 
a, P0 mouse brain section probed for Nmb mRNA (green) with DAPI 
counterstain (nuclei, blue). Scale bar, 1 mm. b, Boxed region (a) showing 
specific expression in RTN/pFRG. Scale bar, 100 µm. c, Whole mount 
P0 brainstem (ventral view) showing Nmb-GFP transgene expression 
(GFP, green) bilaterally in RTN/pFRG. Scale bar, 0.5 mm. d, e, Three- 
dimensional reconstruction (sagittal (d), coronal (e) projections) of 
CLARITY-cleared P14 Nmb-GFP brainstem. Note, RTN/pFRG expression 
ventral, dorsal, and lateral to facial nucleus. Numbers, representative 
neurons. A, anterior; V, ventral; M, medial. Scale bar, 100 µm. f, P0 Nmb- 
GFP-expressing neurons (green) in RTN/pFRG (dashed) co-express RTN 
marker PHOX2B (red). Scale bar, 50 µm. g, P7 Nmb-GFP-expressing 
neurons (green) project to preBötC (dashed). SST (somatostatin), preBötC 
marker (white). Asterisk indicates isolated GFP-labelled neuron in facial 
nucleus. Scale bar, 100 µm. h, Boxed region (g) with NMB co-stain (z-stack 
projection; optical sections, Extended Data Fig. 2). Arrowhead, NMB 
puncta (red) in Nmb-GFP-expressing projection (green) abutting preBötC 
neuron (SST, white). Scale bars, 10 µm (1 µm, inset). i, P7 ventral medulla 
section probed for Nmbr mRNA (purple) showing preBötC expression. 
Scale bar, 100 µm. j, Tiled image (left) and tracing (right) of Nmb-GFP 
neuron as in g projecting to preBötC. Scale bar, 30 µm. 


and timescale of the first component of a sigh were indistinguishable 
from eupneic breaths, like human sighs”®. Following bilateral NMB 
microinjection (100 nl, 3 µM) into preBötC, sighing increased 6–17-fold 
(n = 5; Fig. 2a, c, d and Extended Data Fig. 4a–e). The effect peaked 
several minutes after injection and persisted for 10–15 min afterwards. 
We also tested NMB on explanted preBötC brain slices of neonatal 
mice, where inspiratory activity is detected as rhythmic bursts of 
preBötC neurons and hypoglossal (cranial nerve XII) motoneuron 
output (Fig. 2e). Occasionally, a burst with two peaks (‘doublet’) 
was observed (Fig. 2e)"', a proposed in vitro signature of a sigh (see 
Methods). Addition of 10 nM and 30 nM NMB increased doublet 
frequency by 1.7-fold (P = 0.005; n = 7) and twofold, respectively 
(P = 0.003; n = 7) (Fig. 2e, f). The overall frequency of bursts and 
doublets together was unchanged (P = 0.2; n = 7), implying that NMB 
converts inspiratory bursts into sighs; indeed, in some preparations every 
inspiratory burst was converted to a doublet (Extended Data Fig. 5). 
We conclude that NMB acts directly on preBötC to increase sighing. 


NMBR signalling maintains basal sighing 
To determine if NMB signalling is required for sighing, we monitored 
breathing of awake, unrestrained Nmbr−/− knockout mice. Wild-type 
controls (C57BL/6) sighed 40 ± 11 h⁻¹, whereas Nmbr−/− mutants 
sighed 29 ± 10 h⁻¹ (n = 4; P < 0.001) (Fig. 2g). Sighing was also transiently 
reduced ~50% in anaesthetized rats by NMBR inhibition 
following bilateral preBötC injection of the antagonist BIM23042 
(100 nl, 6 µM) (Fig. 2h and Extended Data Figs 6a–d and 7a). The antagonist 
effect was selective for sighing as it did not significantly alter 
respiratory rate (117 ± 14 vs 109 ± 6 breaths per minute with antagonist, 
n = 4, P = 0.14) or tidal volume (2.1 ± 0.1 vs 2.0 ± 0.3 ml, n = 4, 
P = 0.46), and similar selectivity was observed for the Nmbr mutant 
(respiratory rate 218 ± 22 vs 254 ± 22 in Nmbr−/− mice, n = 4, P = 0.06; 
tidal volume 0.30 ± 0.03 vs 0.31 ± 0.04 ml, n = 4, P = 0.88). Thus NMBR 
signalling in preBötC maintains basal sighing. 


A related neuropeptide pathway modulates sighing 
Nmbr mutations and inhibition reduced but did not abolish sigh- 
ing, suggesting involvement of other pathways. Grp, the only other 
bombesin-like neuropeptide gene in mammals’, was expressed in 
several dozen cells in the dorsal RTN/pFRG in mouse (Fig. 3a) and rat 
(Extended Data Fig. 8e), plus scattered cells in nucleus tractus solitarius 
(NTS) and parabrachial nucleus (PBN), two other breathing circuit 
nuclei'®* (Extended Data Fig. 8a—d). GRP-positive projections were 
traced from RTN/pFRG to preBötC, with some GRP puncta abutting 
SST-positive preBötC neurons (Extended Data Fig. 8f–i). GRP signals 
through GRPR, the receptor most similar to NMBR. Grpr mRNA was 
detected in ~160 mouse preBötC neurons (Fig. 3b and see later), suggesting 
that GRP can also directly modulate preBötC function. 

To determine if GRP regulates sighing, the neuropeptide (100 nl, 
3 µM) was injected bilaterally into preBötC of anaesthetized rats. 
Sighing increased 8–16-fold (n = 5; Fig. 3c and Extended Data 
Fig. 4f–j). GRP (3 nM) application to mouse preBötC brain slices also 
increased sighing, producing 1.7-fold more doublets (P = 0.003; n = 9; 
Fig. 3d). Thus, GRP can induce sighing through direct modulation of 
preBötC neurons, like NMB. 

To determine if GRPR signalling is required for sighing, we 
monitored breathing in Grpr−/− knockout mice. Their basal sigh rate 
(22 ± 9 per hour, n = 4) was half that of control wild-type mice (Fig. 3e), 
whereas eupneic breathing appeared normal (respiratory rate 218 ± 22 
vs 210 ± 16 in Grpr−/−, n = 4, P = 0.57; tidal volume 0.30 ± 0.03 vs 
0.28 ± 0.01 ml, n = 4, P = 0.23). GRPR inhibition by bilateral preBötC 
injection of antagonist RC3095 (100 nl, 6 µM) in anaesthetized rats 
also transiently decreased sighing by ~50% (n = 4), followed by rapid 
rebound and overshoot (Fig. 3f and Extended Data Fig. 6e–h). There 
was no significant change in other respiratory parameters (respiratory 
rate 117 ± 12 vs 111 ± 11 with antagonist, n = 4, P = 0.34; tidal volume 
2.0 ± 0.2 vs 1.9 ± 0.1 ml, n = 4, P = 0.11). Thus, GRPR signalling in 
preBötC also maintains basal sighing. 

Expression patterns, loss-of-function phenotypes and localized 
pharmacological manipulations of NMBR and GRPR signalling in 
preBötC suggest that NMB–NMBR and GRP–GRPR pathways can 
independently modulate sighing. 


NMBR and GRPR are the critical pathways in sighing 
To explore the relationship between NMB and GRP pathways, we 
compared expression patterns of the neuropeptides and receptors 
within mouse RTN/pFRG and preBötC. Nmb and Grp were detected 
in non-overlapping neuronal subpopulations, with Nmb neurons distributed 
throughout RTN/pFRG and Grp neurons restricted to the 
dorsal domain (Fig. 4a–d). In contrast, receptor expression patterns 
in preBötC overlapped (Fig. 4e–h), with 40 ± 16 neurons expressing 
Nmbr, 113 ± 45 expressing Grpr, and 49 ± 9 expressing both (n = 3). 
To explore functional interactions, we injected both neuropeptides 
into preBötC of anaesthetized rats. Sigh rate increased 12–24-fold, 
which was similar to or slightly beyond that of either neuropeptide alone 
(Fig. 4i and Extended Data Fig. 4k–o). When NMBR and GRPR pathways 
were simultaneously inhibited by bilateral injection of both antagonists, 
BIM23042 (100 nl, 6 µM) and RC3095 (100 nl, 6 µM), sighing was 
Figure 2 | NMB effect on breathing. a–c, Breathing activity of 
anaesthetized rat following bilateral NMB injection (100 nl, 3 µM) into 
preBötC. Note increased sighing (spikes in tidal volume (VT), integrated 
diaphragm activity (∫Dia)), but little change in respiratory rate (frequency, f). 
Bar, 1 min. b, c, Similar, stereotyped waveforms of spontaneous (b) and 
NMB-induced (c) sighs (from a; also Extended Data Fig. 3a, c–f). Bar, 
2 s. d, Quantification of (a). Top: raster plots of sighs (tics) in five rats 
following NMB injection (grey); numbers, highest instantaneous sigh rate 
(red tics). Bottom: instantaneous sigh rate of bottom raster plot; numbers, 
average instantaneous sigh rate before and maximum (and fold increase) 
after injection. e, Integrated hypoglossal nerve (∫XII; black) and preBötC 


severely reduced or eliminated (n = 6; Fig. 4j, Extended Data Fig. 6i–n). 
Thus, NMBR and GRPR pathways can independently modulate 
sighing, and together are required for basal sighing in vivo. 
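
The receptor counts reported above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the mean counts quoted in the text (40 Nmbr-only, 113 Grpr-only, 49 co-expressing) and sums them. The function and variable names are hypothetical and are not part of the original analysis.

```python
# Mean preBötC neuron counts quoted in the text (n = 3).
nmbr_only = 40   # neurons expressing Nmbr but not Grpr
grpr_only = 113  # neurons expressing Grpr but not Nmbr
both = 49        # neurons expressing both receptors


def receptor_totals(nmbr_only, grpr_only, both):
    """Return totals for each receptor class and for the whole population."""
    return {
        "Nmbr_expressing": nmbr_only + both,
        "Grpr_expressing": grpr_only + both,
        "any_receptor": nmbr_only + grpr_only + both,
    }


totals = receptor_totals(nmbr_only, grpr_only, both)
overlap_fraction = both / totals["any_receptor"]

print(totals)
print(f"co-expressing fraction: {overlap_fraction:.2f}")
```

The totals (89 Nmbr-expressing, 162 Grpr-expressing, 202 overall) agree with the approximate figures of ~90, ~160 and ~200 neurons quoted elsewhere in the text.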


Effect of NMBR and GRPR neuron ablation 

To determine if preBötC NMBR- and GRPR-expressing neurons 
function specifically in sigh control, we ablated them using bombesin 
(BBN), which binds both receptors’, conjugated to saporin 
Figure 3 | GRP neuropeptide pathway expression and function in 
breathing. a, b, Sagittal ventral medulla sections of P7 mice probed for 
Grp (a) or Grpr (b) mRNA (purple). Scale bar, 200 µm. c, Effect on sighing 
of bilateral preBötC injection of GRP (100 nl, 3 µM), as in Fig. 2d. 
d, Effect of GRP on doublets (sighs) in preBötC slices, as in Fig. 2f. Data as 


neural activity (∫preBötC; grey) in preBötC slices containing indicated 
NMB concentrations. NMB increases doublets (*), a sigh signature in 
slices. Bar, 10 s. f, Quantification of (e) (data as mean ± s.d.; n = 7; 
*P < 0.05 by paired t-test). g, Basal sigh rate in C57BL/6 wild-type (WT) 
and Nmbr−/− mice. n = 4; data as mean ± s.d.; *P < 0.001 by unpaired 
t-test. h, Effect on sighing in anaesthetized rats of bilateral preBötC 
injection (grey) of NMBR antagonist BIM23042 (100 nl, 6 µM). Top: 
raster plots; numbers, longest inter-sigh intervals (s, seconds) following 
injection. Bottom: sliding average sigh rate (bin 4 min; slide 30 s); 
numbers, average rate before (left) and minimum binned rate after 
injection (right). 
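
The sliding average described for panel h can be sketched as follows, assuming sigh times are available as a list of timestamps in seconds. The caption specifies only the 4-min bin and the 30-s slide; the window alignment (windows starting at t = 0), the example recording and all names below are assumptions made for illustration.

```python
def sliding_sigh_rate(sigh_times_s, duration_s, bin_s=240.0, slide_s=30.0):
    """Return (window_start_s, sighs_per_hour) pairs for a sliding window.

    Each window covers [start, start + bin_s) and advances by slide_s.
    """
    rates = []
    start = 0.0
    while start + bin_s <= duration_s:
        end = start + bin_s
        count = sum(1 for t in sigh_times_s if start <= t < end)
        rates.append((start, count * 3600.0 / bin_s))  # convert to sighs per hour
        start += slide_s
    return rates


# Hypothetical recording: one sigh every 90 s during a 10-min trace.
sighs = [90.0 * k for k in range(1, 7)]  # 90, 180, ..., 540 s
rates = sliding_sigh_rate(sighs, duration_s=600.0)

for window_start, rate in rates:
    print(f"{window_start:5.0f} s: {rate:.0f} sighs per hour")
```

The minimum of the returned rates after an injection time would correspond to the "minimum binned rate" reported in the panel.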


(BBN-SAP), a ribosomal toxin that induces neuronal death when 
internalized®. Three days after bilateral BBN-SAP injection (200 nl, 
6.15 ng per side) into preBötC of rats, sighing was reduced ~80%, 
from 24 ± 3 h⁻¹ before injection to 5 ± 4 h⁻¹ three days after injection 
(P=5x10~%n=7) (Fig. 5a). The effect was selective as other aspects 
of breathing and behaviour appeared normal. Five days after injection, 
sighing was almost completely (~95%) abolished, decreasing to 
0.6 ± 0.6 h⁻¹ (P=107*; n = 6; Fig. 5a). Other aspects of breathing and 
mean ± s.d.; n = 9; *P < 0.05 by paired t-test. e, Basal sigh rate in C57BL/6 
wild-type (WT) and Grpr−/− mice. n = 4; data as mean ± s.d.; *P < 0.001 
by Mann–Whitney U-test. f, Effect on sighing of bilateral preBötC 
injection of GRPR antagonist RC3095 (100 nl, 6 µM), as in Fig. 2h. 
Figure 4 | Interactions between NMB and GRP pathways in sighing. 
a–d, RTN/pFRG section of P7 Nmb-GFP mouse immunostained for GFP 
(green, arrowheads) and probed for Grp mRNA (red, arrows). Note no 
expression overlap. Scale bar, 30 µm. e–h, preBötC section of P28 mouse 
probed for Nmbr mRNA (green, arrowheads) and Grpr mRNA (red, 
arrows). Note partial expression overlap. Scale bar, 30 µm. i, Effect on 
sighing of bilateral preBötC injection of both NMB (100 nl, 3 µM) and 
GRP (100 nl, 3 µM) as in Fig. 2d. j, Effect on sighing of bilateral preBötC 
injection (100 nl, 6 µM) of both NMBR and GRPR antagonists (BIM23042, 
RC3095) as in Fig. 2h. 


behaviour again appeared generally intact (Extended Data Fig. 9a, b). 
However, after 5 days we noted increasing episodes of apneas or dis- 
ordered breathing, possibly a consequence of the loss of sighing (see 
Discussion). Ablation prevented sigh induction by exogenous BBN 
infusion into the cisterna magna, confirming preBötC Nmbr- and Grpr- 
expressing neurons were eliminated (Extended Data Fig. 9c). However, 
BBN infusion still triggered intense scratching and licking, demonstrating 
that the Grpr and Nmbr neurons outside the preBötC required for these 
behaviours”? remained intact. We conclude that preBötC Nmbr- and 
Grpr-expressing neurons have a critical and selective function in basal 
sighing. 
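
The size of the ablation effect can be recomputed from the reported mean sigh rates. This is a minimal sketch using only the means quoted in the text (24, 5 and 0.6 sighs per hour); it ignores the spread of the data, and the helper name is hypothetical.

```python
def percent_reduction(before, after):
    """Percentage drop from a baseline value to a later value."""
    return 100.0 * (before - after) / before


baseline = 24.0  # mean sighs per hour before BBN-SAP injection
day3 = 5.0       # mean sighs per hour three days after injection
day5 = 0.6       # mean sighs per hour five days after injection

print(f"day 3: {percent_reduction(baseline, day3):.1f}% reduction")
print(f"day 5: {percent_reduction(baseline, day5):.1f}% reduction")
```

The day 3 value reproduces the ~80% reduction stated in the text. The ratio of the day 5 means comes out slightly higher than the rounded ~95% figure quoted, which may reflect rounding or a different baseline in the original calculation.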


NMBR and GRPR neurons are critical for induced sighs 
To determine if the Nmbr- and Grpr-expressing neurons are also important 
for physiologically induced sighs, we examined BBN-SAP rats 
exposed to hypoxia (8% O2). In control rats injected with unconjugated 
SAP, sighing increased from 24 ± 5 to 140 ± 14 h⁻¹ under hypoxia 
(n = 3, P = 0.01). In contrast, five days after BBN-SAP injection, sigh 
rate under hypoxia was 5.1 ± 7.9 h⁻¹ (n = 6, P = 0.2, hypoxia vs room air 
at day 5; Fig. 5b); three of these rats did not sigh in room air (21% O2) 
and no sighs were triggered by hypoxia. In BBN-SAP rats, hypoxia 
increased respiratory rate from 150 ± 5 to 230 ± 19 breaths per min, 
demonstrating that the ventilatory response to hypoxia was intact. Thus, Nmbr- 
expressing and Grpr-expressing neurons are also critical for hypoxia- 
induced sighing, but not other respiratory responses to hypoxia. 
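
The contrast between the sigh response and the ventilatory response can be expressed as fold changes. The sketch below uses only the mean values quoted in the paragraph above; the helper name is hypothetical and the calculation ignores variability between animals.

```python
def fold_change(before, after):
    """Ratio of the value under hypoxia to the value in room air."""
    return after / before


# Control rats (unconjugated SAP): sigh rate in room air vs hypoxia.
control_sigh_fold = fold_change(24.0, 140.0)

# BBN-SAP rats: respiratory rate in room air vs hypoxia.
ablated_resp_fold = fold_change(150.0, 230.0)

print(f"control sigh rate under hypoxia: {control_sigh_fold:.1f}-fold")
print(f"BBN-SAP respiratory rate under hypoxia: {ablated_resp_fold:.1f}-fold")
```

On these means, hypoxia raises sighing nearly sixfold in controls, while ablated animals still raise their respiratory rate by about half.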


Discussion 

Our results show sighing is controlled by two largely parallel 
bombesin-like neuropeptide pathways, NMB and GRP, which 
mediate signalling between key medullary breathing control 
centres. Approximately 200 Nmb-expressing and ~30 Grp-expressing 
neurons in neighbouring domains of RTN/pFRG, a region impli- 
cated in integrating respiratory sensory cues and generation of active 
expiration?*”*°, project to preBotC, the respiratory rhythm generator. 
A total of 7% (~200) of preBötC neurons express Nmbr (~40 neurons), 


296 | NATURE | VOL 530 | 18 FEBRUARY 2016 


[Figure 5 graphic. Panels a, b: bar plots of sigh rate, with BBN-SAP ablation. Panel c: circuit diagram showing integration neurons (GRP neurons, ~30; NMB neurons, ~200) and sigh-activation neurons (GRPR neurons, ~110; NMBR neurons, ~40; NMBR/GRPR neurons, ~50).] 

Figure 5 | Effect on sighing of ablating preBötC NMBR-expressing 
and GRPR-expressing neurons. a, b, Basal (a) and hypoxia-induced 
(b) sigh rates before (control) and 3 or 5 days after preBötC injections of 
bombesin-saporin (200 nl, 6.2 ng; BBN-SAP ablation) to ablate NMBR- 
and GRPR-expressing neurons, or 5 days after saporin alone (200 nl, 6.2 ng; 
Blank-SAP). Data as mean ± s.d.; P < 0.05 (BBN-SAP ablation vs control 
rats at day 3 or 5 for both room air (a) and hypoxia (b)). Sample size n = 6 
(control), 3 (Blank-SAP), 7 (day 3), 6 (day 5). c, Model of peptidergic sigh 
control circuit. NMB- and GRP-expressing neurons in RTN/pFRG 
(and perhaps GRP-expressing neurons in NTS and PBN) receive 
physiological and perhaps emotional input from other brain regions, 
stimulating neuropeptide secretion. This activates preBötC neurons 
expressing their receptors, which transform the normal 
preBötC rhythm to sighs. (Because neuropeptides induce sighs separated 
by normal breaths (Fig. 2a), there must be some refractory mechanism in 
or downstream of receptor-expressing neurons that temporarily prevents a 
second sigh.) 


Grpr (~110) or both receptors (~50), activation of which increased 
sighing 6-fold to 24-fold, whereas sighing was effectively abolished 
by inhibition or deletion of the receptors, or ablation of the receptor- 
expressing neurons. 

We propose the above neurons, perhaps with Grp-expressing NTS 
and PBN neurons, comprise the core of a peptidergic sigh control 
circuit, with the neuropeptide-expressing neurons integrating inputs 
from sites monitoring physiological and perhaps emotional state 
(Fig. 5c). Excitation of these neurons and secretion of either neuropeptide 
activates the cognate receptor-expressing preBötC neurons, which 
initiate sighs by altering activity of other preBötC neurons to convert 
normal breaths to sighs. This might occur by a burst of NMB and/or 
GRP secretion triggering a second inspiratory signal in the preBötC 
during or immediately after the first, resulting in a single, double-sized 
breath. Alternatively, NMB and GRP secretion might be more gradually 
modulated, causing concentration-dependent bursts in activity of 
receptor-expressing neurons or a shift in preBötC properties towards 
states favouring more frequent doublet bursts. 

A priority will be to elucidate the full circuit and properties of the 
constituent neurons, including its integration with other peptides and 
neurotransmitters that influence sighing!>'® and with normal breathing 
and other behaviours. A curious aspect of the circuit already apparent 
is the central role of two partially overlapping and closely related neu- 
ropeptide pathways. Do NMB- and GRP-expressing neurons receive 
different inputs and have distinct sensing functions, and do the three 
sets of receptor-expressing neurons (NMBR, GRPR, NMBR + GRPR) 
converge on the same preBötC neurons to effect a sigh, or signal to 
different preBötC neurons producing distinct types of sighs? 

A striking aspect of our results is the selectivity of the circuit for sighing. 
Inhibition of the pathways, and even ablation of receptor-expressing 
neurons, had little effect on other aspects of breathing, at least in the 
short term. This can now be exploited to test the classical hypothesis 
for the physiological function of sighing—re-expansion of alveoli that 
collapse during breathing and maintenance of lung integrity*~*, and 
to investigate psychological benefits. Identification of the key neuro- 
peptide pathways suggests pharmacologic approaches for controlling 
excessive sighing and inducing sighs in patients who cannot breathe 
deeply on their own. Dozens of molecularly distinct preBotC neuronal 
types have been identified recently (K.Y. and M.A.K., in preparation); 
perhaps they serve similarly specific roles in other respiratory-related 
behaviours like yawning, sniffing, crying, and laughing. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Animals. All procedures were carried out in accordance with animal care standards 
in National Institutes of Health (NIH) guidelines, and approved by the University 
of California, Los Angeles Animal Research Committee, or Stanford Institutional 
Animal Care and Use Committee. All mouse strains used were in the C57BL/6 
genetic background. Nmb-GFP BAC transgenic mice (Tg(Nmb-EGFP)IT50Gsat/ 
Mmucd), carrying EGFP coding sequence inserted upstream of the Nmb start 
codon in the BAC, were from the Mutant Mouse Regional Resource Centers 
(catalogue number 030425-UCD, https://www.mmrrc.org). Nmbr−/− and 
Grpr−/− null mutant mice, in which exon 2 of the endogenous gene was replaced 
by a neomycin-resistance cassette using homologous recombination, have been 
described*!*. Male Sprague Dawley rats were from Charles River. 

In situ hybridization, immunostaining and reporter expression. For in situ 
hybridization, mouse brains were harvested and fixed overnight in 4% paraformaldehyde 
in phosphate buffered saline (PBS), cryopreserved in 30% sucrose and 
embedded in optimal cutting temperature compound (OCT). Transverse sections 
were cut at 16–20 µm and stored at −80 °C until use. Sections were post-fixed 
in 4% paraformaldehyde before treatment with hydrochloric acid, proteinase K, 
and then triethanolamine/acetic anhydride. Hybridization was carried out with 
in vitro transcribed and digoxigenin-labelled riboprobes at 58 °C overnight. Signal 
was detected using alkaline phosphatase-coupled anti-digoxigenin (DIG) primary 
antibody (Roche) and nitro blue tetrazolium chloride and bromochloroindolyl 
phosphate (NBT/BCIP) Reagent Kit (Roche) or using horseradish peroxidase-coupled 
anti-DIG primary antibody (Roche) and Tyramide Signal Amplification 
plus Fluorescent Substrate Kit (PerkinElmer). 

For double fluorescent in situ hybridization, tissue was harvested, embedded in 
OCT and then sectioned. Sections were fixed in 4% paraformaldehyde, dehydrated 
and treated with pretreatment reagent (Advanced Cell Diagnostics). Double fluo- 
rescent in situ assay was then performed using proprietary RNAscope technology 
(Advanced Cell Diagnostics) with cyanine 3 labelled Nmbr probes and fluorescein 
isothiocyanate labelled Grpr probes. 

For Nmb-GFP reporter expression analysis, the brains of Nmb-GFP mice 
were harvested and fixed overnight in 4% paraformaldehyde and then cryopreserved 
at 4°C in 30% sucrose overnight. Tissue was embedded in OCT and sectioned 
at 10–40 µm. Tissue sections were rinsed with PBT (PBS + 0.1% Tween), 
blocked with 3% bovine serum albumin (BSA) in PBT for 1 h, and incubated with primary 
antibody overnight at 4°C. Sections were rinsed in PBT and incubated for 1h 
at room temperature with species-specific secondary antibodies. Primary antibodies
were: chicken anti-GFP (Abcam 13970; used at 1:1,000 dilution), goat anti-PHOX2B
(Santa Cruz, sc-13224; 1:200 dilution), rabbit anti-NMB (Sigma-Aldrich,
SAB1301059; 1:100 dilution), rabbit anti-GRP (Immunostar 20073; 1:4,000 dilu- 
tion), and rat anti-SST (Millipore, MAB354; 1:50 dilution). Secondary antibodies 
included donkey anti-chicken (Jackson Immuno Research; 1:400 dilution), don- 
key anti-rat (Jackson Immuno Research; 1:400 dilution), donkey anti-rabbit 
(Jackson Immuno Research; 1:400 dilution) and donkey anti-goat (Invitrogen; 
1:500 dilution). 

For Nmb-GFP expression analysis in samples prepared by CLARITY, Nmb-GFP
mice were perfused with PBS and formaldehyde-acrylamide hydrogel, and
brain tissue was harvested and incubated in hydrogel monomer solution at 4°C for 
3 days. Tissue was then embedded in polymerized hydrogel by raising the temper- 
ature to 37°C for 3h. Blocks of 1 mm thickness were cut and washed in 4% sodium 
dodecyl sulphate (SDS) in sodium borate buffer at 37°C for 2 to 3 weeks. Samples 
were washed with PBST for 2 days and incubated in FocusClear (CelExplorer), and 
GFP fluorescence was imaged on a Zeiss LSM780 confocal microscope. 
Sigh monitoring and analysis. For awake animals, individual animals were placed 
in a whole body plethysmography chamber (Buxco) at room temperature (22°C) 
in 21% O2 (for normoxia) or 8% O2 (for hypoxic challenge) balanced with N2.
Sighs were identified in plethysmography traces by the characteristic biphasic 
ramp, the augmented flow in the second phase of the inspiratory effort and the 
prolongation of expiratory time following the event. Sighs were also confirmed by 
visual monitoring of breathing behaviour. Given the high amplitude and distinctive 
waveform of sighs relative to standard eupneic breaths, sighs were unambiguously 
identified by both visual and computer-assisted scoring; no difference was detected 
in direct comparisons between methods or observers, so visual scoring was used. 
Female 8-week-old mice (Nmbr−/−, Grpr−/−, or C57BL/6 as wild-type control)
were acclimated to the chamber for 10 min, and then the first fifteen recorded 
sighs were used to calculate the sigh rate. Similar sigh rates were observed for each 
animal when assayed on different days. Rats were acclimated to the chamber for 
~1h, and then baseline sigh rate and respiratory frequency were calculated for the 
next 2h. For hypoxic (8% O2) challenge, analysis was continued for 30 min under 
the hypoxic condition. 
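
As a minimal sketch of the sigh-rate calculation described above, the snippet below converts the times of the first fifteen scored sighs into a rate per hour. The function name `sigh_rate_per_hour` and the choice to divide the number of inter-sigh intervals by the elapsed time are assumptions made for illustration; the text does not state the exact formula that was used.

```python
def sigh_rate_per_hour(sigh_times_s, n_sighs=15):
    """Estimate sighs per hour from the first n_sighs scored events.

    sigh_times_s: event times in seconds (hypothetical input format).
    Assumption: rate = number of inter-sigh intervals / elapsed time.
    """
    if len(sigh_times_s) < n_sighs:
        raise ValueError("not enough sighs recorded")
    first = sorted(sigh_times_s)[:n_sighs]
    elapsed_s = first[-1] - first[0]
    if elapsed_s <= 0:
        raise ValueError("sigh times must span a positive duration")
    return 3600.0 * (n_sighs - 1) / elapsed_s


# Example: one sigh every 90 s corresponds to 40 sighs per hour.
times = [90.0 * i for i in range(20)]
rate = sigh_rate_per_hour(times)
```

Using intervals rather than event counts avoids overestimating the rate when the recording window starts and ends on a sigh.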


For anaesthetized rats, the trachea was cannulated and connected to a 
pneumotachograph (GM Instruments) to record airflow. A flow calibration was 
performed after every experiment along with a calculation of tidal volume (VT)
by digital integration. To monitor diaphragm activity, wire electrodes (Cooner 
Wire) were implanted into the diaphragm and electromyogram (EMG) signal 
sampled at 2 kHz (Powerlab 16SP; AD Instruments). Signal was rectified and digitally
integrated (time constant of 0.1) to obtain a moving average using LabChart Pro 
8 (AD Instruments) and Igor Pro 6 (Data Matrix) software. Sighs were identified 
in the airflow measurements as above and validated by double peaks in the EMG 
recordings. 
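
The rectify-and-integrate step can be illustrated with a short sketch. It uses a trailing boxcar moving average as a stand-in for the integrator in the acquisition software, which is an assumption; the function name and the interpretation of the 0.1 time constant as 0.1 s are likewise hypothetical.

```python
def rectify_and_smooth(signal, fs_hz, window_s):
    """Full-wave rectify a signal, then apply a trailing moving average.

    Assumption: a boxcar window of window_s seconds approximates the
    'digitally integrated' EMG; the original software may differ.
    """
    n = max(1, int(round(window_s * fs_hz)))  # window length in samples
    rectified = [abs(x) for x in signal]
    smoothed = []
    running = 0.0
    for i, x in enumerate(rectified):
        running += x
        if i >= n:
            running -= rectified[i - n]  # drop the sample leaving the window
        smoothed.append(running / min(i + 1, n))
    return smoothed


# Example with a 2-sample window (window_s * fs_hz = 2).
envelope = rectify_and_smooth([2, -4, 6], fs_hz=1, window_s=2)
```

At the 2 kHz sampling rate given in the text, a 0.1 s window would span 200 samples.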
preBötC injection of NMB and GRP agonists and antagonists. Male Sprague
Dawley rats (n = 24) weighing 320-470 g were anaesthetized with urethane 
(1.5 g per kg), isoflurane (0.3-0.7 vol%), and ketamine (20 mg per kg per hour) 
and injected i.p. with atropine (0.3 mg per kg), then placed in a supine position 
in a stereotaxic instrument (David Kopf Instruments). A tracheostomy tube was 
placed in the trachea through the larynx, and the basal aspect of the occipital bone 
was removed to expose the ventral medulla. Injections were placed 750 µm caudal
from the most rostral root of the hypoglossal nerve (RRXII), 2 mm lateral to the
midline, and 700 µm dorsal to the ventral medullary surface. Small corrections
were made to avoid puncturing blood vessels on the medulla. Microinjection was 
done using a series of pressure pulses (Picospritzer; Parker-Hannifin) applied to 
the open end of micropipettes, with air pressure set so that each pulse ejected ~5 nl 
and a total volume of 0.1 µl was injected on each side. The concentrations of injected
neuropeptides and antagonists were: NMB, 3 µM; GRP, 3 µM; NMB and GRP, 3 µM
each; BIM23042 (ref. 33), 6 µM; RC3095 (ref. 34), 6 µM; BIM23042 and RC3095,
6 µM each. To verify the accuracy of the injections, fluorescent polystyrene beads
(0.2 µm FluoSpheres (Invitrogen; catalogue #F8811, #F8763 or #F8807); 2-5% vol)
were added to the injected solutions, and following injection and physiological 
measurements the location of the fluorescent beads was visualized in wet tissue 
double stained for reelin and choline acetyltransferase (ChAT) to identify the 
preBötC.

Ablation of NMBR- and GRPR-expressing preBötC neurons. Bilateral preBötC
injections of saporin conjugated with either bombesin (BBN-SAP; Advanced 
Targeting System; 200 nl, 6.2 ng) or a non-targeted peptide (blank SAP; Advanced 
Targeting System; 200 nl, 6.2 ng) were performed in 300-350 g rats under
anaesthesia (ketamine (90 mg per kg) and xylazine (10 mg per kg), administered
i.p.) using standard aseptic procedures. Rats were positioned on a stereotaxic frame
with bregma 5 mm below lambda. The occipital bone was exposed and a small 
window was opened to perform BBN-SAP injections with a 40 µm diameter tip
glass pipette inserted into the preBötC. Coordinates were (in mm): 0.9 rostral,
2.0 lateral, and 2.8 ventral to the obex. The electrode was left in place for 5 min after 
injection to minimize backflow of solution up the electrode track. After injection, 
a fine polyethylene cannula was implanted and cemented to the occipital bone to 
deliver BBN into the fourth ventricle. Neck muscles and skin were sutured back 
at the end of the surgery and rats were allowed to recover with pain medication, 
food and water ad libitum. Blank-SAP and BBN-SAP treated rats were tested 
for hypoxia (8% O2 balanced with nitrogen, 30 min challenge) five days after
surgery. Blank-SAP and BBN-SAP treated rats were also tested for response to 
BBN infusion in the cisterna magna six days after surgery. The cannula implanted in
the fourth ventricle was connected to fine polyethylene tubing under isoflurane
anaesthesia, and after recovery and placement of rats in a plethysmographic chamber,
10 µg of BBN diluted in 20 µl sterile saline was delivered followed by a 20 µl
saline washout. Sigh rate and respiratory rate were calculated for 30 min following 
infusion and compared to pre-infusion values. 

In vitro slice preparation, recording, and analysis. Rhythmic 550-µm-thick
transverse medullary slices containing the preBötC and XII nerve from neonatal
C57BL/6 mice (P0-5) were prepared as described previously. The medullary
slice was cut in artificial cerebrospinal fluid (ACSF) containing (in mM): 124 NaCl,
3 KCl, 1.5 CaCl2, 1 MgSO4, 25 NaHCO3, 0.5 NaH2PO4, and 30 D-glucose, equilibrated
with 95% O2 and 5% CO2 (4 °C, pH = 7.4). For recording, extracellular K+
was raised to 9 mM to replace excitatory afferent drive lost in the cutting process.
Slices were perfused at 27 °C and 4 ml min−1 and allowed to equilibrate for 30 min.
Respiratory activity reflecting suprathreshold action potential (AP) firing from 
populations of neurons was recorded as XII bursts from XII nerve roots
and as population activity directly from the preBötC using suction electrodes and
a MultiClamp 700A or 700B (Molecular Devices, Sunnyvale, CA, USA), filtered at
2-4 kHz and digitized at 10 kHz. Digitized data were analysed off-line using custom
procedures written for IgorPro (Wavemetrics, Portland, OR, USA). Activity was 
full-wave rectified and digitally integrated with a Paynter filter with a time con- 
stant of 20 ms with either custom built electronics or using custom procedures in 
MATLAB (Mathworks, Natick, MA, USA).

Burst detection and analysis of respiratory-related activity recorded in full-wave
rectified XII output or preBötC population recordings were performed using custom
software written in IgorPro. Burst parameters were normalized to the mode of
the data in the baseline condition. Although there are several proposed definitions 
of sighs in slice preparations, here we used ‘doublets’ (double-peaked bursts) as
the in vitro signature of sighs because, in our preparations, doublets detected both
in preBötC neural population activity measurements and cranial nerve XII output
recordings shared the increased inspiratory and expiratory duration of sighs.
Furthermore, as demonstrated here, doublet rate increased following application
of NMB or GRP to preBötC slice preparations (Figs 2 and 3 and Extended Data
Fig. 5), as did sigh rate in vivo following preBötC injection of the same neuropeptides
(Figs 2 and 3). The frequency and waveform of doublets in slice preparations
do not closely match those of sighs in intact animals, presumably due to the
absence in vitro of important inputs modulating burst shape; indeed, the doublets
more closely resemble sighs in vagotomized animals, where they appear as equal
amplitude double-peaked breaths. We scored a burst as a doublet if the burst
displayed a second peak that reached 20% or more of the amplitude of the first 
burst, and this second peak occurred after more than twice the time from start to 
peak or if the burst had a duration longer than eight times the time from start to 
peak. All doublets were verified by visual inspection to exclude multipeaked bursts 
and two bursts that were too far apart. Measured doublet intervals were converted 
to a calculated per hour doublet rate. 
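
The scoring rule above can be written as a short sketch. The function and parameter names are hypothetical, and two readings are assumed where the text is ambiguous: the amplitude and timing conditions on the second peak are grouped together (with the long-duration condition as an alternative), and the time of the second peak is measured from burst onset. The conversion from intervals to an hourly rate is assumed to be 3,600 s divided by the mean interval.

```python
def is_doublet(first_peak_amp, t_start, t_first_peak, t_end,
               second_peak_amp=None, t_second_peak=None):
    """Apply the doublet criteria to one burst (all times in seconds).

    Assumed grouping: (second peak >= 20% of first AND later than twice
    the start-to-peak time) OR (burst longer than 8x start-to-peak time).
    """
    rise = t_first_peak - t_start  # time from burst start to first peak
    if rise <= 0:
        raise ValueError("first peak must follow burst start")
    has_second = second_peak_amp is not None and t_second_peak is not None
    late_second_peak = (
        has_second
        and second_peak_amp >= 0.2 * first_peak_amp
        and (t_second_peak - t_start) > 2 * rise
    )
    long_burst = (t_end - t_start) > 8 * rise
    return late_second_peak or long_burst


def doublets_per_hour(intervals_s):
    """Convert measured doublet intervals (s) to a per-hour rate.

    Assumption: rate = 3600 / mean interval.
    """
    mean_interval = sum(intervals_s) / len(intervals_s)
    return 3600.0 / mean_interval
```

Bursts flagged by such a rule would still need the visual check described in the text to exclude multipeaked bursts and pairs of separate bursts.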

Statistics. Data are represented as mean ± standard deviation (s.d.). Statistical
significance was uniformly set at a minimum of P < 0.05. For comparisons 
of two groups, the assumption of normal distribution was assessed by the
Shapiro-Wilk test with the critical W value set at 5% significance level. The t-tests
were conducted, with the exception in Fig. 3e, in which a Mann-Whitney U-test 
was used. For statistical comparisons of more than two groups, an ANOVA was 
first performed. In most cases, a two-way repeated measures ANOVA was used for 
comparisons of various parameters in different conditions and for making com- 
parisons across different events. If the null hypothesis (equal means) was rejected, 
post-hoc paired t-tests were then used for pairwise comparisons of interest. Individual 
P values are reported, but Holm-Bonferroni analysis for multiple comparisons was 
conducted to correct for interactions between the multiple groups. Histograms 
were normalized by the total sample size to generate plots of the relative frequency 
of each value where the y value of each bin represents the fraction of the total 
number of samples for that experiment. Randomization and blinding were not 
used. No statistical method was used to predetermine sample size. 
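
For readers unfamiliar with the Holm-Bonferroni correction mentioned above, the following is a minimal sketch of the standard step-down procedure. The function name and the example P values are illustrative assumptions and are not taken from the analysis in this paper.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    Standard Holm step-down rule: sort P values ascending and compare the
    k-th smallest (k = 0, 1, ...) with alpha / (m - k); stop at the first
    P value that fails.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger P values are also retained
    return reject


# Hypothetical P values from three pairwise comparisons.
decisions = holm_bonferroni([0.01, 0.04, 0.03])
```

The procedure controls the family-wise error rate while being less conservative than dividing alpha by the number of comparisons for every test.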

Extended Data Figure 1 | Expression of Nmb in rodent brain.
a, b, Sagittal sections of P7 mouse (a) and P7 rat (b) brain showing
RTN/pFRG region probed for Nmb mRNA expression (purple) by in situ
hybridization as in Fig. 1. Scale bars, 100 µm. c, d, Nmb expression as in a
showing regions outside ventrolateral medulla. Nmb is expressed in mouse
olfactory bulb (c) and hippocampus (d). Scale bars, 200 µm (c) and 100 µm
(d). e-h, Section through RTN/pFRG brain region of P0 transgenic Nmb-GFP
mouse immunostained for GFP (green) and probed for Nmb mRNA
(red) by in situ hybridization. Blue, DAPI nuclear stain. Nmb-GFP and
Nmb mRNA are mainly co-expressed in the same cells. Scale bar, 100 µm.

Extended Data Figure 2 | Serial confocal preBötC sections showing
Nmb-GFP projections contain puncta of NMB. a-d, Serial confocal
optical sections (0.6 µm apart) through preBötC brain region of Nmb-GFP
mouse immunostained for GFP (green), NMB (red), preBötC marker SST
(white), and DAPI (blue) as in Fig. 2h. Note the GFP-positive projection
with a punctum of NMB (yellow, open arrows in b, c) directly abutting an
SST-positive neuron (asterisk). Most NMB puncta (open arrowheads) were
detected within GFP-positive projections as expected, and only a small
fraction of NMB puncta (closed arrowhead) were detected outside them;
NMB outside Nmb-GFP projections could be secreted protein or the rare
Nmb-expressing cells that do not co-express the Nmb-GFP transgene
(see Extended Data Fig. 1e-h). Scale bar, 20 µm.


© 2016 Macmillan Publishers Limited. All rights reserved


Extended Data Figure 3 | Sighing after surgery and bilateral injection of
saline into preBötC. a, Example of a sigh in a breathing activity trace of a
urethane anaesthetized rat after surgery as in Fig. 2a-c. VT, tidal volume;
∫Dia, integrated diaphragm activity; Dia, raw diaphragm activity trace.
b, Sigh rate before (control) and after (saline) bilateral saline injection
into preBötC. There is no effect of saline injection (data are mean ± s.d.,
n = 5, P = 0.83 by paired t-test). c-f, Breathing activity trace as in a
(but also showing airflow). Note stereotyped waveform of sighs (d-f).
Bars, 1 min (c), 1 s (d-f).

Extended Data Figure 4 | Effects on sighing in individual rats following
bilateral injection into preBötC of NMB, GRP and both NMB/GRP.
a-e, Raster plot of sighs (upper) and instantaneous sigh rates (lower)
before and after NMB injection for the five experiments (a-e) shown
in Fig. 2d. f-j, Raster plot of sighs (upper) and instantaneous sigh rates
(lower) before and after GRP injection for the five experiments (f-j)
shown in Fig. 3c. k-o, Raster plot of sighs (upper) and instantaneous
sigh rates (lower) before and after NMB/GRP injection for the five
experiments (k-o) shown in Fig. 4i. Grey, injection period; arrowhead
in raster plots, maximum instantaneous sigh rate; numbers, basal (left)
and maximal instantaneous sigh rate (right) and fold induction
(in parentheses) after neuropeptide injection.

Extended Data Figure 5 | Effect of NMB and GRP on rhythmic activity of
preBötC slice. a, Neuronal activity trace (∫XII, black; ∫preBötC population
activity, grey) of preBötC slice containing 30 nM NMB, as in Fig. 2e. Note
the extreme effect of NMB in which every burst (‘breath’) in the trace is a
doublet (‘sigh’, asterisk). Bar, 5 s. b, c, NMB increases the doublet rate by
increasing the fraction of total events that are doublets (b) and decreasing
the interval following a doublet (c). Data are mean ± s.d., *P < 0.05 by
paired t-test, n = 7. d, e, GRP also increases the doublet rate by increasing
the fraction of total events that are doublets (d) and decreasing the interval
following a doublet (e). Data are mean ± s.d., *P < 0.05 by paired t-test,
n = 9. Note that post-doublet intervals are significantly longer than
post-burst intervals under all conditions, consistent with longer post-sigh
apneas in vivo.

Extended Data Figure 6 | Effects on sighing in individual rats following
bilateral injection of BIM23042, RC3095 and BIM23042/RC3095
into preBötC. a-d, Raster plot of sighs (upper) and binned sigh rates
(lower; bin size 4 min; slide 30 s) before and after injection of the NMBR
antagonist BIM23042 for the four experiments shown in Fig. 2h.
e-h, Raster plot of sighs (upper) and binned sigh rates (lower; bin size
4 min; slide 30 s) before and after injection of the GRPR antagonist RC3095
for the four experiments shown in Fig. 3f. i-n, Raster plot of sighs (upper)
and binned sigh rates (lower; bin size 4 min; slide 30 s) before and after
BIM23042 and RC3095 injection for the six experiments shown in Fig. 4j.
Grey, injection period; numbers, longest inter-sigh intervals (s, seconds)
following injection.
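
The binned sigh rates in this figure use a 4 min bin advanced in 30 s steps. A minimal sketch of such a sliding-window rate is given below; the function name, the half-open window convention and the placement of each value at the bin centre are assumptions for illustration rather than details stated in the legend.

```python
def sliding_sigh_rate(sigh_times_s, t_start_s, t_end_s,
                      bin_s=240.0, slide_s=30.0):
    """Return (bin_centre_s, sighs_per_hour) for each window position.

    Assumptions: windows are half-open [start, start + bin_s) and only
    windows that fit entirely inside the recording are reported.
    """
    rates = []
    start = t_start_s
    while start + bin_s <= t_end_s:
        count = sum(1 for t in sigh_times_s if start <= t < start + bin_s)
        rates.append((start + bin_s / 2.0, count * 3600.0 / bin_s))
        start += slide_s
    return rates


# Example: one sigh per minute over a 10 min recording.
example = sliding_sigh_rate([60.0 * i for i in range(10)], 0.0, 600.0)
```

Overlapping windows smooth the rate estimate, so neighbouring values are not statistically independent.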

Extended Data Figure 7 | Specificity of antagonists BIM23042 and
RC3095 in preBötC slice. a, BIM23042 (100 nM) blocks the effect of
NMB (10 nM), but not GRP (3 nM) in preBötC slices. Data are mean ± s.d.,
*P < 0.05 by paired t-test, n = 7. b, RC3095 (100 nM) shows the opposite
specificity, blocking the effect of GRP (3 nM), but not NMB (10 nM). Data
are mean ± s.d., *P < 0.05 by paired t-test, n = 9.

Extended Data Figure 8 | Expression of Grp in rodent brain. a, b, In situ
hybridization of mouse brain slices as in Fig. 3a showing expression of Grp
(purple) in parabrachial nucleus (PBN) (a) and nucleus tractus solitarius
(NTS) (b). Scale bar, 200 µm. c-e, In situ hybridization of rat brain slices
showing expression of Grp in PBN (c), NTS (d), RTN/pFRG (e). Scale bar,
200 µm. f, Tiled image showing GRP-positive projection (red) from
RTN/pFRG region to preBötC region containing SST-positive neuron
(green). Scale bar, 20 µm. g-i, Serial confocal optical sections (0.8 µm
apart) through mouse preBötC stained for GRP (red) and SST (green)
focusing on short segment of GRP-positive projection where a GRP punctum
(red) directly abuts (arrowhead) an SST-positive neuron. Scale bar, 10 µm.

Extended Data Figure 9 | Effect of bombesin injection on sighing
following BBN-SAP-induced ablation of NMBR-expressing and
GRPR-expressing preBötC neurons. a, b, 10 min plethysmography traces
of a control rat (a) and a day 5 BBN-SAP injected rat (b) during eupneic
breathing (left). Indicated parts (10 s) of traces are expanded at right. Note
presence of sighs with stereotyped waveform in control rat, and no sighs
detectable in BBN-SAP injected rat. c, Sigh rate before (control) and after
10 µg bombesin injection (BBN) into the cisterna magna of rats before
BBN-SAP injection (WT) and at day 4 and day 6 after BBN-SAP injection
(BBN-SAP) into the preBötC to ablate NMBR-expressing and GRPR-expressing
neurons as in Fig. 5a, b. Values shown are mean ± s.d. (WT,
n = 10; BBN-SAP, n = 7 for day 4 and n = 5 for day 6), *P < 0.05 by paired
t-test; n.s., not significant.

Cryo-EM structure of the yeast
U4/U6.U5 tri-snRNP at 3.7 Å resolution

Thi Hoang Duong Nguyen¹*, Wojciech P. Galej¹*, Xiao-chen Bai¹, Chris Oubridge¹, Andrew J. Newman¹,
Sjors H. W. Scheres¹ & Kiyoshi Nagai¹


U4/U6.U5 tri-snRNP represents a substantial part of the spliceosome before activation. A cryo-electron microscopy
structure of Saccharomyces cerevisiae U4/U6.U5 tri-snRNP at 3.7 Å resolution led to an essentially complete atomic model
comprising 30 proteins plus U4/U6 and U5 small nuclear RNAs (snRNAs). The structure reveals striking interweaving
interactions of the protein and RNA components, including extended polypeptides penetrating into subunit interfaces.
The invariant ACAGAGA sequence of U6 snRNA, which base-pairs with the 5′-splice site during catalytic activation,
forms a hairpin stabilized by Dib1 and Prp8 while the adjacent nucleotides interact with the exon binding loop 1 of U5
snRNA. Snu114 harbours GTP, but its putative catalytic histidine is held away from the γ-phosphate by hydrogen bonding
to a tyrosine in the amino-terminal domain of Prp8. Mutation of this histidine to alanine has no detectable effect on yeast
growth. The structure provides important new insights into the spliceosome activation process leading to the formation
of the catalytic centre.


Pre-messenger RNA splicing is catalysed by an intricate molecular
machine called the spliceosome and proceeds by a two-step trans-esterification
mechanism, analogous to group II intron self-splicing.
The spliceosome is assembled on pre-mRNA by the ordered addition
of small nuclear ribonucleoprotein particles (snRNPs) and numerous
proteins, including the nineteen complex (NTC) and the nineteen
related (NTR) complex. Initially U1 and U2 snRNPs recognize the
pre-mRNA 5′-splice site (5′SS) and branch point (BP), respectively.
Recruitment of U4/U6.U5 tri-snRNP produces the fully assembled
but catalytically inactive complex B. U1 snRNP is displaced from the
5′SS by Prp28 (ref. 5), the 5′SS pairs with the ACAGAGA sequence
in U6 snRNA, and Brr2 helicase unwinds the extensively base-paired
U4/U6 snRNAs to release U4 snRNA with its associated proteins.
This allows U6 snRNA to base-pair with U2 snRNA generating the
group II intron-like catalytic RNA core. The 2′OH group of the BP
adenosine attacks the 5′SS, producing exon1 and lariat intron-exon2
intermediates, and after further remodelling to complex C*, U5 snRNA
loop 1 aligns exons 1 and 2 for the second trans-esterification. The
spliced mRNA product is released and the residual intron lariat spliceosome
(ILS) is disassembled, recycling the snRNPs for subsequent
rounds of splicing.

Electron microscopic studies of spliceosomes at different stages of
the splicing cycle revealed low-resolution pictures of these complexes.
Taking advantage of the recent revolution in cryo-electron microscopy
(cryo-EM) single particle analysis, we reported the organization
of the proteins, and U5 snRNA and U4/U6 snRNAs in S. cerevisiae
U4/U6.U5 tri-snRNP based on a cryo-EM map at 5.9 Å (ref. 16). In
Schizosaccharomyces pombe cell extract, ILS complexes containing U2,
U5 and U6 snRNAs accumulate through inefficient disassembly.
The structure of this endogenous U2.U6.U5 spliceosomal complex was
determined by single particle cryo-EM at 3.6 Å resolution. This
was an important breakthrough that revealed the overall architecture
of a spliceosomal complex with the striking structures of NTC and
NTR. The absence of spliced mRNA and step 2 factors from this
complex confirms that it is the post-splicing ILS. The structure also
revealed the important features of the well-established group II
intron-like catalytic RNA core remaining after spliced mRNA is
released.

Here we present an essentially complete atomic model of S. cerevisiae
U4/U6.U5 tri-snRNP based on a cryo-EM density map at 3.7 Å
overall resolution, revealing the architectural and mechanistic principles
of spliceosome activation.


Overall structure 

We collected a new data set on a Titan Krios microscope using the 
Gatan K2 Summit direct electron detector (Methods). The overall 
resolution of the tri-snRNP map was improved from 5.9 Å to 3.7 Å
(Extended Data Fig. 1). Using a modified masked refinement with signal
subtraction, we obtained more homogeneous 3.6, 3.7 and 4.2 Å
reconstructions for the body, foot and head domains, respectively,
and improved the resolution of the arm domain from 10 Å to 6-7.5 Å
(Extended Data Fig. 2). The new maps enabled us to build a near-complete
atomic model of the yeast tri-snRNP containing 30 proteins,
U4/U6 and U5 snRNAs (Fig. 1 and Supplementary Information), 
revealing an amazing web of interactions between components of the 
complex (Extended Data Fig. 3). 


Prp8

A complete atomic model of Prp8 is now built, except for the unstructured
N terminus and inter-domain linkers. The α-helix (αRT1) at
the N terminus of the reverse transcriptase (RT) domain in the crystal
structure extends further and forms a helix bundle (HB) with
three additional long helices appended to the RT domain (Fig. 2a, b).
Residues 108-733 form a predominantly α-helical N-terminal domain.
Stems I and II of U5 snRNA are coaxially stacked and an extra variable
stem protrudes from the three-way junction (Extended Data Fig. 4).
A long, slightly bent C-terminal α-helix (residues 703-735) of the
N-terminal domain fits into the minor groove of the coaxially stacked
stems I and II, which is tightly harnessed in the major groove by a
polypeptide loop (residues 535-543) protruding from the N-terminal
domain (Fig. 2c). The conserved loop 1 of U5 snRNA, which aligns
the exons during the second trans-esterification reaction, points


¹MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK.
*These authors contributed equally to this work. 

Figure 1 | Three orthogonal views of a near-complete atomic model of
the Saccharomyces cerevisiae U4/U6.U5 tri-snRNP. Inset shows four
sub-domains.


towards the most positively charged and conserved surface of Prp8 in
the thumb/linker domain, part of the active site cavity. The BP+2
nucleotide cross-links in active spliceosomes between Prp8 residues
1585-1598, on the cavity surface (C. M. Norman and A.J.N., unpublished
observations). This region is disordered in the Prp8-Aar2
complex, whereas in U4/U6.U5 tri-snRNP it forms a helix-turn-helix
(the α-finger) and contacts U54-U55 of U4 snRNA near the three-way
junction (Fig. 2b).

The 5′-stem-loop of U6 snRNA interacts with the N-terminal
domain of Prp8 and the adjacent single-stranded region pairs with
the exon binding U5 snRNA loop 1 (Fig. 2d). The small highly conserved
protein Dib1 (ref. 25) binds to the helix bundle and α-finger
of Prp8, and a long polypeptide of Prp31. U6 snRNA forms a short
stem-loop, involving part of the ACAGAGA sequence, which is sandwiched
between Dib1 and the Prp8 large domain (residues 1648-1653)
(Fig. 2d, e; Extended Data Fig. 4a).


Snu114

We built a near-complete atomic model of Snu114 comprising five
domains (D1-D5) similar to EF-G/EF-2 (refs 26, 27). The relative
arrangement of D1-D3 closely resembles that of EF-G/EF-2, whereas
D4 and D5 pack more compactly (Fig. 3a). The guanine nucleotide
density is consistent with GTP bound via canonical interactions with
the surrounding residues (Fig. 3b; Extended Data Figs 3a and 5a-e).
In most GTPases the glutamine residue in the switch 2 loop places a
water molecule at the γ-phosphate of GTP and hydrolyses the phosphate
ester. As in EF-Tu, EF-G and their eukaryotic counterparts,
the catalytic glutamine residue is replaced by histidine in Snu114
(ref. 26) (Extended Data Fig. 5e). In U4/U6.U5 tri-snRNP, His218 is
hydrogen-bonded to Tyr403 of Prp8, preventing the His218 side chain
from rotating towards the γ-phosphate of GTP and hence keeping the
GTPase inactive (Fig. 3c). In EF-G and EF-Tu, GTP is hydrolysed when
this histidine is repositioned by a hydrogen bond with a phosphate
in the sarcin-ricin loop of the ribosome (Fig. 3d). The extensive
interactions between Snu114 and the N-terminal domain of Prp8
are conserved between U4/U6.U5 tri-snRNP and the S. pombe ILS
(Extended Data Fig. 5f). The hydrogen bond between His218 of Snu114
and Tyr403 of Prp8 is maintained by the equivalent residues in ILS
(Extended Data Fig. 5d). The GTP binding site of Snu114 is at the interface
with the N-terminal domain of Prp8, leaving insufficient room for
U5 snRNA or any proteins to access the GTPase active site and act like
the sarcin-ricin loop or as a GTPase activating protein (GAP).
Since the structure suggests no obvious mechanism for Snu114 GTPase

Figure 2 | Prp8 and U4/U6 and U5 snRNAs. a, Domain structure
of Prp8. HB, helix bundle; RT, reverse transcriptase-like; Endo,
endonuclease-like; RH, RNase H-like; JM, Jab1/MPN. b, Prp8 makes
extensive interactions with U4/U6 and U5 snRNAs. c, The α-helix
(residues 703-735) of the N-terminal domain fits into the minor groove
of U5 snRNA and an extended polypeptide (residues 535-543) fits into
the major groove on the opposite face, harnessing the RNA helix firmly
in place. d, Orthogonal view. U5 snRNA loop 1 interacts with the single-stranded
region of U6 snRNA adjacent to its 5′-stem-loop. e, The region
around the ACAGAGA sequence forms a hairpin and is sandwiched
between the large and N-terminal domains of Prp8 and Dib1.

Figure 3 | Snu114 and its interaction with guanine nucleotide, Prp8 and
U5 snRNA. a, Snu114, the N-terminal domain of Prp8 and U5 snRNA
stems I and II form a stable domain in the foot domain in U4/U6.U5
tri-snRNP. GTP is bound in the GTPase active site at the interface with
the Prp8 N-terminal domain. b, Canonical interactions of GTP with
surrounding residues in Snu114. c, The catalytic His218, hydrogen bonded
to Tyr403 in Prp8, points away from the GTP γ-phosphate. d, Activation of
EF-G GTPase upon binding to the sarcin-ricin (SR) loop in the ribosome.
His87 moves closer to the γ-phosphate and places a water molecule.

Figure 4 | Interactions of U4/U6 snRNAs with proteins. a, Overview of
U4/U6 di-snRNP. b, The extraordinary structure of Prp3 and its multiple
interactions with U4/U6 snRNA, Prp4, Snu13, the RNase H-like domain
of Prp8, Brr2 N-terminal domain and the LSm core domain. c, The
C-terminal region of Prp31 extends along U4 snRNA 5′-stem towards the


activation we investigated the function of Snu114 by mutagenesis. With
the His218Arg mutation, yeast shows only a mild temperature-sensitive
phenotype, confirming earlier results (Extended Data Fig. 5g),
whereas the equivalent mutation in EF-Tu reduces cognate tRNA- 
induced GTPase activity 10°-fold**. Surprisingly, yeast containing the 
His218Ala mutant of Snu114 shows no apparent phenotype (Extended
Data Fig. 5g), while the equivalent mutation in EF-Tu reduces the rate 
of GTP hydrolysis more than 10°-fold*”. Furthermore, mutations of 
Tyr403 (Tyr403Phe and Tyr403Ala) in Prp8, which hydrogen-bonds 
with His218 in Snu114, have no apparent phenotype (Extended Data
Fig. 5h). These results raise the possibility that Snu114-bound GTP may
not be hydrolysed during splicing. 

The guanine nucleotide in the post-splicing ILS is interpreted as
GDP, but its conformation is distinct from that of GDP in other
GTPases (Extended Data Fig. 5a). In contrast, the conformation of
the Snu114-bound GTP in tri-snRNP superimposes well with GTP or
non-hydrolysable GTP analogues in other GTPases (Extended Data
Fig. 5c). When we refined our structure of Snu114 with GDP the resulting
GDP conformation is similar to that observed in ILS (Extended
Data Fig. 5b). The proposed guanine nucleotide-dependent regulatory
role of Snu114 is based on the effects of XDP, XTP and a non-hydrolysable
XTP analogue on the XTP binding mutant (D271N) of Snu114
(refs 33, 34). Mutations in the GTPase domain prevent the interaction
of Snu114 with Prp8, blocking U5 snRNP assembly. The observed
effect of XDP, XTP and non-hydrolysable XTP may be due to XTP-induced
stabilization and association of Snu114 mutants with Prp8.


The U4/U6 di-snRNP 

The extensively base-paired U4/U6 snRNAs form a three-way helix 
junction (Extended Data Figs 4a, b). Snu13, bound to the k-turn
motif, is wedged between the U4 5′-stem-loop and the U4/U6 snRNA
helix II and packed against the Prp4 WD40 domain (Fig. 4a). Prp3
makes extensive interactions with the Prp4 WD40 domain, the basket
handle-like structure and Snu13, and forms a long α-helix sitting in
the minor groove of U4/U6 helix I. After forming a short α-helix,

three-way junction. d, The C-terminal extension of Prp31 makes multiple
interactions with U4/U6 snRNAs, Dib1, Prp8 α-finger and the N-terminal
extension of Prp6. e, The Prp4 WD40 domain and Prp31 interact with the
C-terminal TPR domain of Prp6.


Prp3 folds back to form a long α-helix binding across the major groove 
of U4/U6 helix II (Fig. 4b; Extended Data Fig. 3d). These latter two 
Prp3 helices and the connecting loop interact extensively with the 
RNase H-like domain of Prp8 and Brr2 N-terminal domain. Prp3 
further extends to form a ferredoxin-like domain*’, which packs against 
the Prp4 WD40 domain*”. Masked classification of the arm domain 
reveals extra density for the N terminus of Prp3 extending towards the 
LSm protein ring (Extended Data Fig. 6a, b). The 3’-end of U6 snRNA 
binds to the central hole of the LSm protein ring while the preceding 
single-stranded region binds to the ferredoxin-like domain of Prp3 
(ref. 38). The Nop and coiled-coil domains of Prp31 interact with 
Snu13, whereas the k-turn motif of U4 5’-stem-loop is sandwiched 
between Snu13 and Prp31 (refs 36, 39) (Fig. 4c). The extended poly- 
peptide chain of Prp31 runs between the phosphate backbone of U4 
5’-stem and Dib1, and forms a small domain together with Prp6 which 
is surrounded by the three-way RNA helix junction and the α-finger 
and helix bundle of Prp8 (Fig. 4d). 

The C terminus of the Prp6 TPR repeats” interacts with the Prp4 
WD40 domain, Snu13, Prp31 and the tip of U4 5’-stem-loop (Fig. 4e), 
while an extended N-terminal polypeptide of Prp6 packs against the 
RNase H-like domain of Prp8 and interacts with the small carboxy-terminal 
domain of Prp31, the Prp8 α-finger and the U4/U6 snRNA three-way 
junction, and then wraps around the Prp8 helix bundle (Fig. 4d; 
Extended Data Fig. 3d-f). The numerous interactions that Prp6 makes 
with U4/U6 snRNP components and Prp8 reflect its importance for 
tri-snRNP assembly“’. 


Brr2 

The single-stranded region of U4 snRNA (Extended Data Figs 3b and 
4a), extending from stem I, enters the active site of Brr2 N-terminal 
helicase cassette near the strand-separating β-hairpin and passes 
through the channel between the RecA1, RecA2, Ratchet and WH 
domains” (Extended Data Fig. 7a-c). The N-terminal domain (NTD) of 
Brr2 extends towards U4/U6 stem II and contacts the long helix of Prp3 
running along the phosphate backbone of U4 snRNA. Brr2 inserts a loop 
Figure 5 | B complex formation and activation mechanism. a, U4/U6.U5 
tri-snRNP fits into the EM envelope of human complex B* (reproduced 
from ref. 45 with permission), showing that U2 snRNP binds near the 
LSm core domain, Prp6 and Prp3. b, Overlay of the Prp8 large domain 
between tri-snRNP and the ILS!” shows how NTC/NTR might bind to 
complex B and interact with U2 snRNP so that U2 snRNP can be passed 
to the NTC/NTR complex. c, A comparison of the tri-snRNP and the 
ILS!’ structures shows rotation of the foot domain with respect to the 
Prp8 large domain. Upon rotation, Prp8 residues 602-614 will clash with 
Dib1 and ACAGAGA helix, causing them to dissociate thus liberating the 
ACAGAGA sequence to bind the 5’-splice site. 


of the NTD into the minor groove of U4/U6 stem II (Extended Data 
Fig. 7b, d). These interactions may guide U4/U6 stem II during unwinding. 
Snu13, Prp4 WD40, Prp3 ferredoxin, and Prp31 Nop and coiled-coil 
domains assemble together while the long α-helices and stretched 
polypeptide chains of Prp3 and Prp31 extend from these domains and 
interact with U4/U6 stem II and U4 5’-stem-loop, respectively*’. These long 
α-helices and extended polypeptides may function like elastic bands to 
accommodate conformational changes and partial strand separation of 
the U4/U6 duplex as Brr2 translocates along U4 snRNA and unwinds 
U4/U6 stem I (refs 16, 43). Brr2 forms a stable complex with the Jab1/ 
MPN domain of Prp8 (ref. 42), which is attached to the RNase H-like 
domain of Prp8 via a long flexible linker, enabling both Brr2 and U4/ 
U6 di-snRNP to detach from the main body of Prp8 during unwinding. 
The improved map of the head domain at 4.5-5 Å resolution, 
obtained by masked refinement, enabled us to build most of the 
Snu66 structure as poly-Ala chains. Its N-terminal region forms a 
globular domain that interacts with Prp8 endonuclease-like and Brr2 
N-terminal ratchet domains. This is followed by a long helix wedged 
between Prp8 Jab1/MPN and Brr2 N-terminal HLH domains while its 
C terminus wraps around Brr2, forming extensive interactions with the 
Brr2 C-terminal cassette (Extended Data Fig. 7e), fully consistent with 
yeast two-hybrid and co-immunoprecipitation assays“. Interestingly, 
our global classification approach showed ‘open’ and ‘closed’ conformations 
of the head and foot domains (Extended Data Fig. 6c-e). In 
the ‘closed’ conformation, the globular domain of Snu66 contacts the 
N-terminal domain of Prp8, which in turn interacts with Snu114. 


Insight into spliceosome activation 
A comparison of U4/U6.U5 tri-snRNP with B® and BΔU1 
complexes*® shows that U2 snRNP docks with tri-snRNP where the LSm 
complex, Prp3 and Prp6 are located, while U1 snRNP sits on top of U2 
snRNP (Fig. 5a). The components of NTC/NTR are also detected by 
mass spectrometry in complex B?. We compared the structures of our 
tri-snRNP and the post-splicing ILS!® by overlaying the large domain 
of Prp8 together with Snu114 and the U5 core domain. This shows 
that NTC and NTR can associate with tri-snRNP without clashing 
and contact U2 snRNP (Fig. 5a, b). In complex B, U2 snRNP inter- 
acts with U4/U6.U5 tri-snRNP**, but when NTC and NTR dock with 
tri-snRNP, U2 snRNP is passed to NTC and NTR, and U2 Sm domain 
and U2B"/U2A’ complex associate with Aquarius(Cwfl11), Syfl(Cwf3) 
and Isy1(Cwf12)* as revealed in the S. pombe ILS? (S. pombe protein 
names are shown in parentheses). 

The S. cerevisiae U4/U6.U5 tri-snRNP and S. pombe ILS structures 
reveal that the foot domains of the two structures, containing the Prp8 
N-terminal domain, U5 snRNA stem-loop I and Snu114, superpose 
very well showing that they form a stable structural unit (Extended 
Data Fig. 5f). Overlay of their Prp8 large domains shows that the foot 
domain rotates as a rigid body by 30° between the two structures, 
causing U5 loop 1 to move closer towards the Prp8 α-finger in the 
post-splicing ILS’? (Fig. 5c). NTC forms extensive interfaces with both 
the N-terminal and large domains of Prp8, hence the rotation of the 
foot domain may be caused by NTC. When the foot domain of the 
U4/U6.U5 tri-snRNP structure rotates by 30° (as in the post-splicing ILS) 
Prp8 residues 602-614 clash with Dib1 and the ACAGAGA helix, forcing 
Dib1 to dissociate from the large domain of Prp8 and liberating the 
ACAGAGA sequence to bind the 5’SS. It is known from the U4 snRNA 
cs1 mutation and its suppressor in U6 snRNA (U6-Dup)* that the pairing 
between 5’SS and the ACAGAGA sequence is a checkpoint for the 
unwinding of the U4/U6 snRNA duplex by Brr2. Thus, conformational 
toggling of the Prp8 N-terminal domain could couple 5’SS recognition 
to U4/U6 unwinding. Suppressors of the U4-cs1 mutation suggest 
the allosteric changes required for Brr2 activation. Interestingly, these 
suppressors form four clusters in Prp8: one at the interface between 
the RT domain and D3 of Snu114, one at the interface between the 
helix bundle and Prp31, one on the surface of the endonuclease-like 
domain near the ACAGAGA hairpin, and one on the surface of the 
N-terminal domain where the 5’-stem-loop of U6 snRNA binds, near 
the interface with Snu66 that undergoes a transition between the open 
and closed forms (Extended Data Fig. 6c-e). It is possible that NTC/ 
NTR play important roles in inducing allosteric changes that trigger 
the unwinding of U4/U6 snRNA by Brr2 (ref. 50). 

The cryo-EM structures of S. cerevisiae U4/U6.U5 tri-snRNP and 
S. pombe ILS!’ have provided a wealth of new information about the 
architecture and conformational changes of these spliceosomal assem- 
blies. Functional studies based on these new structural insights should 
greatly enhance our understanding of spliceosome activation and 
catalysis. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Statistics. No statistical methods were used to predetermine sample size. 
Sample preparation. Tri-snRNP sample was prepared as described in our pub- 
lished protocol!®. 

Electron microscopy. Aliquots of 3.5 µl of purified yeast tri-snRNP were applied 
to Quantifoil Cu R1.2/1.3, 400 mesh grids, which were coated with 6 nm-thick 
homemade carbon film and glow-discharged in N-amylamine. The grids were 
blotted for 2 s at 4 °C, plunged into liquid ethane by an FEI Vitrobot MKIII at 
100% humidity and loaded onto a Titan Krios transmission electron microscope 
operated at 300 kV. Zero-loss-energy images were collected manually on a Gatan 
K2-Summit detector in super-resolution counting mode at a calibrated magnification 
of 35,714 (pixel size of 1.43 Å) and a dose rate of ~2.5 electrons per Å² per 
second (Extended Data Fig. 1a). We used a slit width of 20 eV on a GIF Quantum 
energy filter. Each image was exposed for a total of 16 s and dose-fractionated into 
20 movie frames. A defocus range of 0.5-3.5 µm was used. 
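
As a quick consistency check on the exposure settings above, the total dose and the dose per movie frame follow directly from the stated dose rate, exposure time and frame count. The sketch below is illustrative arithmetic only: the function name is hypothetical, and the inputs are the approximate values quoted in the text.

```python
def exposure_dose(dose_rate, exposure_s, n_frames):
    """Return (total dose, dose per frame) in electrons per square angstrom.

    dose_rate  -- electrons per square angstrom per second
    exposure_s -- total exposure time in seconds
    n_frames   -- number of movie frames the exposure is split into
    """
    total = dose_rate * exposure_s
    return total, total / n_frames


# Values quoted in the text: ~2.5 e/A^2/s, 16 s exposure, 20 frames.
total, per_frame = exposure_dose(2.5, 16.0, 20)
print(f"total dose ~{total:.0f} e/A^2, ~{per_frame:.1f} e/A^2 per frame")
```

With these inputs the accumulated dose is about 40 electrons per Å², or roughly 2 electrons per Å² in each frame.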

Image processing. MOTIONCORR?®! was used for whole-image drift correction 
of the movie frames of each micrograph, and contrast transfer function (CTF) 
parameters of the corrected micrographs were estimated using CTFFIND4 
(ref. 52). All subsequent processing steps were done using RELION* unless oth- 
erwise stated. A subset of ~5,000 particles was picked manually, extracted using 
a 380² pixel box and subjected to reference-free 2D classification. Some of the 
resulting 2D class averages were low-pass filtered to 20 Å and used as references 
for automatic particle picking of the whole data set of 2,477 micrographs. The 
automatically picked particles were screened manually to remove false positives, 
aggregation and ice contamination, resulting in an initial set of 473,827 particles 
for reference-free 2D classification. We selected 438,602 particles from good 2D 
classes for the 3D classification (Extended Data Fig. 1b, c), which was run for 
25 iterations, using an angular sampling of 7.5°, a regularisation parameter T of 
4 and a 60 Å low-pass filtered initial model from our previous reconstruction’®. 
A subset of 140,155 particles was selected for the first 3D auto-refinement. Particle-based 
beam-induced motion correction and radiation-damage weighting (particle 
polishing) were performed on these particles™. Auto-refinement of the polished 
particles resulted in a reconstruction at 3.7 Å overall resolution with an estimated 
angular accuracy of 1.1°. 

Local resolution analysis by Resmap* showed a range of resolution from 3.0 Å 
in the core to 10 Å in the arm domain and part of the head domain, indicating 
conformational heterogeneity within the complex. As previously observed, the 
four domains of the structure, particularly the head and arm domains, are flexible 
in our structure. We employed two classification/refinement approaches: a local 
approach to improve the local resolution of the domains and a global approach to 
allow global conformations of the domains relative to one another to be observed 
(Extended Data Fig. 1c). For the local approach, we used a masked refinement 
procedure with signal subtraction for each of the head, body and foot domains” 
and a masked classification with signal subtraction followed by a masked refine- 
ment for the most flexible arm domain”>. Each of the four domains only makes 
up a third or less of the total mass of the complex. For each domain, we subtracted 
projections from the remaining three domains of the reconstruction in the exper- 
imental particle images using the relative orientation of each experimental image 
from the last auto-refinement run of all the polished particles. This resulted in four 
sets of new experimental particle images that only have signal from the domain of 
interest. For the body, foot and head domains the subtracted experimental images 
were used in 3D auto-refinement with a soft mask for that domain, yielding 3.6, 
3.7 and 4.2 Å reconstructions for the body, foot and head domains, respectively 
(Extended Data Figs 1b, 2a-c). The arm domain is too small for masked refinement. 
Thus we performed 3D classification on the subtracted experimental images 
with a mask around the arm domain and no alignments. We selected three classes 
with 23,760, 26,367 and 24,627 particles each with three distinct conformations for 
the arm domain. Since the arm domain is too small for accurate alignments of the 
particles, we refined each of these classes together with the body domain using a 
new set of modified experimental particle images that included both the arm and 
body domains by the same subtraction method used previously (Extended Data 
Figs 1c and 6a, b). Class 1 with 23,760 particles yielded a 4.6 Å overall resolution 
for both body and arm domains and 6.2 Å resolution for the arm domain alone. 
Class 2 with 26,363 particles yielded a 4.5 Å overall resolution for both body and 
arm domains and 7.5 Å resolution for the arm domain alone. Class 3 with 24,627 
particles yielded a 4.4 Å overall resolution for both body and arm domains and 
6.2 Å resolution for the arm domain alone. 

For the global classification/refinement approach, we performed 3D classifica- 
tion of the polished particles for the whole complex with a finer angular sampling 
of 1.8° and local angular search range of 10°. Two of the sub-classes of 48,945 and 
36,824 particles had significantly better angular accuracies and gave 4.2 Å and 
4.3 Å reconstructions, respectively, after auto-refinement with more homogeneous 
conformations of the head and foot domains. We observed distinct “open” and 
“closed” conformations for the head and foot domains (Extended Data Fig. 6c-e). 
All reported resolutions are based on the gold-standard Fourier shell correlation 
(FSC) = 0.143 criterion®®. FSC curves were calculated using soft spherical masks 
and high-resolution noise substitution was used to correct for convolution effects 
of the masks on the FSC curves”’. Prior to visualization, all maps were corrected 
for the modulation transfer function of the detector. Local resolution was estimated 
using Resmap”’. 
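
The FSC = 0.143 criterion reads the resolution off the point where the half-map FSC curve first falls below 0.143. The sketch below illustrates that read-out step only, assuming the FSC curve has already been computed; it is not the RELION implementation, the function name is hypothetical, and the toy curve values are made up for illustration.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return the resolution (in angstroms) where FSC first drops below threshold.

    freqs -- spatial frequencies of the shells in 1/angstrom, increasing
    fsc   -- FSC value for each shell
    The crossing is located by linear interpolation between the two shells
    that bracket it. Returns None if the curve never drops below threshold.
    """
    for k in range(1, len(fsc)):
        if fsc[k - 1] >= threshold > fsc[k]:
            # Fraction of the way from shell k-1 to shell k at the crossing.
            t = (fsc[k - 1] - threshold) / (fsc[k - 1] - fsc[k])
            f_cross = freqs[k - 1] + t * (freqs[k] - freqs[k - 1])
            return 1.0 / f_cross
    return None


# Toy FSC curve (hypothetical values, not data from this study).
toy_freqs = [0.10, 0.20, 0.25, 0.30]
toy_fsc = [0.99, 0.80, 0.243, 0.043]
toy_resolution = resolution_at_threshold(toy_freqs, toy_fsc)
print(f"resolution at FSC = 0.143: {toy_resolution:.2f} A")
```

For the toy curve the crossing lies midway between the last two shells, at a spatial frequency of about 0.275 per Å.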
Model building. The maps resulting from local masked refinements were first used 
for de novo model building using our previous protein placements’ because they 
have the best local resolution for each of the domains separately. All model building 
was performed in COOT™®. In our medium resolution structure, except for Brr2*?-7?, 
Prp8**>-2413, Snu13 and LSm proteins whose yeast structures are available, all other 
proteins are either human, homology models or idealized poly-Ala helices and 
only double-stranded RNA helices were modelled. Recently the structure of the 
ferredoxin domain of yeast Prp3 (residues 335-467) became available*®, which 
replaced our homology model of this domain. We rebuilt and extended the yeast 
Brr2“4?-7163, Snu13, Prp8®*>-743 and Prp3**>-” and all remaining components 
were built de novo first into the masked refinement maps and rigid-body fitted 
into the overall 3.7 Å map. We identified a previously unassigned density as that 
of Snu66 based on previous yeast two-hybrid studies“! and its interacting proteins 
in our structure. The LSm proteins were rigid-body fitted into the overall map and 
the improved maps of the 3 classes from masked classification and refinement. 
Extended Data Table 1 summarizes all modelled components of the structure. The 
model was refined using REFMAC 5.8 (ref. 59) with secondary structure restraints 
provided by PROSMART™® and RNA base-pair and stacking restraints provided 
by LIBG®!. We first performed model refinement for the Body, Foot and Head 
domains separately against the corresponding masked refined maps (Extended 
Data Table 2a). The subunits of these three refined models were rigid-body fitted 
into the masked overall map. To resolve the possible clashes in the domain inter- 
faces, we refined this overall model against the overall map. Cross-validation of two 
half maps defined a REFMAC refinement weight of 0.001. The Xmipp package” 
was used to calculate FSC model versus map. FSC curves of model versus map 
were calculated for the maps of the body, foot and head domains, which were 
used for model building and refinement of the structure and also the overall map 
(Extended Data Fig. 2d-g). Extended Data Table 2 summarizes refinement statistics 
for the overall structure and the domain structures and the deposited maps and 
their associated coordinates. 
Map visualization. Maps were visualized in Chimera® and all figures were pre- 
pared using either Pymol (http://www.pymol.org) or Chimera. 
Plasmid Shuffling. Mutations were introduced into PRP8 and SNU114 genes by 
the dut⁻ ung⁻ method“. The viability of Prp8 and Snu114 mutants was assessed 
by plasmid shuffling analysis. The Prp8 deletion strain SC261A8B1 (ref. 65) car- 
rying wild type PRP8 on pRS316 (URA3, centromeric replication origin) was 
transformed with mutant Prp8 genes on pRS314 (TRP1, centromeric replication 
origin) and transformants were selected on plates lacking tryptophan. The Snu114 
deletion strain YSNU114KO1 (ref. 66) carrying wild type SNU114 on pRS416 
(URA3, centromeric replication origin) was transformed with mutant Snu114 
genes on pRS413 (HIS3, centromeric replication origin) and transformants 
were selected on plates lacking histidine. Trp+ and His+ cells were transferred 
onto plates containing 5-fluoro-orotic acid (5-FOA), to test cell growth at 
30 °C after loss of the URA3-marked plasmid. Plasmids were rescued from the 
5-FOA-resistant strains and sequenced to confirm the presence of the appropri- 
ate mutation, and cell growth was assessed on YEPD plates incubated at various 
temperatures. 
Extended Data Figure 1 | Image processing procedures. a, Representative micrograph. b, Representative 2D class averages obtained from reference-free 
2D classification. c, Classification and refinement procedures used in this study. 
of model refinement by half-maps for the body, foot, head and overall 
maps, respectively. The red curves show FSC between the atomic model 
and the half-map it was refined against (half1) and the blue curves show 
FSC between the atomic model and the other half-map (half2) it was not 
refined against. The black curves show FSC between the atomic model and 
the sum map which the model was refined against. 
Extended Data Figure 3 | Representative EM density for different 
components of the map. a, Snu114 in the Foot domain with a bound GTP 
(magenta). The inset shows the GTP-binding pocket. b, Brr2 in the head 
domain with a bound single-stranded region of U4 snRNA. The inset 
shows the density in the RNA binding tunnel. c, Density for Prp8 large 
and RNase-like domains. The inset shows the density in the core of 
Prp8. d-f, Prp3, Prp31 and Prp6 densities, respectively, with extended 
polypeptides. 
Extended Data Figure 4 | Secondary structure of the snRNAs in 
tri-snRNP. a, U4/U6 snRNA; c, U5 snRNA. The coloured nucleotides with 
red, green and blue background were built de novo into our EM density. 
The region near the ACAGAGA sequence of U6 snRNA forms a stem-loop 
that was not predicted previously. b, d, Representative EM density for 
U4/U6 snRNA duplex and U5 snRNA, respectively. 
Extended Data Figure 5 | Interactions of Snu114 with guanine 
nucleotides and the N-terminal domain of Prp8 in the S. cerevisiae 
U4/U6.U5 tri-snRNP and S. pombe ILS complexes. a, Conformation of 
the Snu114(Cwf10)-bound GDP refined in the S. pombe ILS spliceosomal 
complex!?° (red, PDB 3JB9) is overlaid on GDPs found in other 
guanine-nucleotide binding proteins (grey, PDB coordinates: 1DAR, 2E1R, 
2WRI, 1Z0I, 5CA8, 1XTQ, 4YLG, 1SF8, 5BXQ). b, Guanine nucleotide 
refined as GDP in Snu114 of the S. cerevisiae U4/U6.U5 tri-snRNP (blue) 
is overlaid on GDPs found in the PDB coordinates as in a. c, Conformation 
of guanine nucleotide refined as GTP in Snu114 of the S. cerevisiae 
U4/U6.U5 tri-snRNP (blue) agrees well with GTP or GTP analogues in 
other guanine-nucleotide binding proteins (PDB code: 2BV3, 2DY1, 2J7K, 
4YW9, 1ASO, 1LFO (grey)). d, Superposition of the active site of 
Snu114-GTP and Cwf10-GDP. e, Superposition of the GDP-bound EF-G 
(2WRI), GMP-PCP bound EF-G (4JUW) and Snu114 (S. cerevisiae 
tri-snRNP) active sites. His218 (His87 in EF-G) positions a water molecule 
crucial for GTP hydrolysis. f, Comparison of the Prp8 N-terminal domain, Snu114 
and U5 snRNA in the S. cerevisiae U4/U6.U5 complex and S. pombe ILS 
complex. g, Growth of serial dilutions of yeast strains carrying wild-type 
Snu114, His218Arg or His218Ala Snu114 mutants at different temperatures. 
Cells were spotted on YPD plates and grown at 14 °C for 10 days, 30 °C 
and 37 °C for 2 days. h, Growth of serial dilutions of yeast strains carrying 
wild-type Prp8, Tyr403Phe and Tyr403Ala mutants. Cells were spotted 
on YPD plates and grown at 14 °C for 9 days, 30 °C for 3 days. This yeast 
strain does not survive at 37 °C and thus is not shown. 
Extended Data Figure 6 | Conformational flexibility of tri-snRNP 
observed by classification. a, Different conformations of the arm domain 
demonstrated by the unsharpened maps of the three major classes (purple, 
magenta and red) obtained from masked classification of the arm domain 
alone followed by masked refinement with the body and arm domains. 
The body domain was included in the refinement because the arm domain 
is too small for accurate alignments. b, The sharpened map of one of the 
three classes with Prp3 and LSm models shown. In the improved domain 
maps for the arm domain, extra density for the N-terminal helix of Prp3 
could be observed to extend to the LSm proteins. c, The sharpened map 
of the tri-snRNP and the locations of Snu66 and Prp8. d, The open and 
closed conformations of the head and foot domains of the tri-snRNP 
observed by global classification. The unsharpened maps for the two major 
classes obtained from global classification with finer angular sampling 
(1.8°) followed by 3D auto-refinement are shown. The open and closed 
states are indicated. e, Superposition of the unsharpened maps of the 
open (grey) and closed (yellow) states shown in d. The arrows indicate the 
rotations of the head and foot domains. 
Extended Data Figure 7 | Brr2 helicase and its U4/U6 snRNA substrate. 
a, Domain structure of Brr2 helicase comprising the N-terminal domain 
and two helicase cassettes. Individual domains of N-terminal helicase 
cassette (NHC) are colour-coded. b, Extensive interactions of Brr2 with 
U4/U6 snRNA and Prp3. The single-stranded region of U4 snRNA 
extending from stem I enters the active site near the β-finger (red). 
c, 3’ stem of U4 snRNA interacts with the HLH domain of NHC. d, The 
N-terminal domain (NTD) of Brr2 interacts with a long helix of Prp3 and 
inserts a loop into U4/U6 Stem II. e, Snu66 has a long extended region that 
wraps around both helicase cassettes of Brr2. 
Extended Data Table 1 | Summary of model building of tri-snRNP components 

Protein       Total residues  M.W.      Modeled           Chain name  Local map                                         Human/S. pombe names

U5 snRNP
Prp8          2413            279,299   110-2401          A           108-735: Foot; 751-2104: Body; 2147-2401: Head    220K/Spp42
Brr2          2163            246,125   364-2163          B           363-433: Body; 439-2163: Head                     200K/Brr2
Snu114        1008            114,025   102-989           C           Foot                                              116K/Cwf10
Dib1          143             16,774    2-137             D           Body                                              15K/Dim1
SmB           196             22,403    4-102             D           Foot                                              SmB/SmB
SmD3          110             11,229    1-109             d           Foot                                              SmD3/SmD3
SmD1          146             16,288    15-108            h           Foot                                              SmD1/SmD1
SmD2          110             12,856    4-85              i           Foot                                              SmD2/SmD2
SmE           94              10,373    4-92              e           Foot                                              SmE/SmE
SmF           96              9,659     12-83             f           Foot                                              SmF/SmF
SmG           77              8,479     2-76              g           Foot                                              SmG/SmG
U5 snRNA-L    214             68,847    4-173             -           4-173: Foot; 88-107: Body                         -

U4/U6 snRNP
Snu13         126             13,570    3-126             K           Body                                              15.5K/Snu13
Prp31         494             56,305    43-457            F           Body                                              61K/Prp31
Prp3          469             55,877    150-467           G           Body                                              90K/Prp3
Prp4          465             52,425    109-465           H           Body                                              60K/Rna4
SmB           196             22,403    4-102             k           Head                                              SmB/SmB
SmD1          146             16,288    1-118             l           Head                                              SmD1/SmD1
SmD2          110             12,856    15-108            m           Head                                              SmD2/SmD2
SmD3          110             11,229    4-85              n           Head                                              SmD3/SmD3
SmE           94              10,373    10-92             p           Head                                              SmE/SmE
SmF           96              9,659     12-83             q           Head                                              SmF/SmF
SmG           77              8,479     2-76              r           Head                                              SmG/SmG
Lsm2          95              11,164    1-90              2           Arm                                               Lsm2/Lsm2
Lsm3          89              10,020    3-79              3           Arm                                               Lsm3/Lsm3
Lsm4          172             20,304    1-90              4           Arm                                               Lsm4/Lsm4
Lsm5          93              10,415    4-84              5           Arm                                               Lsm5/Lsm5
Lsm6          86              9,396     11-84             6           Arm                                               Lsm6/Lsm6
Lsm7          115             13,010    26-105            7           Arm                                               Lsm7/Lsm7
Lsm8          109             12,385    1-67              8           Arm                                               Lsm8/Lsm8
U4 snRNA      160             51,390    1-152             V           1-67: Body; 73-152: Head                          -
U6 snRNA      112             36,088    1-112             W           1-26: Foot; 26-88: Body; 108-112: Arm             -

tri-snRNP specific
Prp6          899             104,234   155-898           J           Body                                              102K/Prp1
Snu66         587             66,426    5-560 (poly-Ala)  E           Head                                              110K/Snu66
Prp38         242             27,957    Not modeled       -           -                                                 hPrp38/Prp38
Snu23         194             22,682    Not modeled       -           -                                                 hSnu23/Snu23
Spp381        291             33,764    Not modeled       -           -                                                 -


© 2016 Macmillan Publishers Limited. All rights reserved 




Extended Data Table 2 | Refinement, model statistics and structure/map depositions 


a. Statistics of tri-snRNP structure determination 


EM Titan Krios 300kV, K2 Gatan Summit 


Defocus range (µm) -0.5 to -3.5 


Accuracy of rotations (°) 1.13 5 1 



Final resolution (Å) 3.7 


Refinement weight 0.001 0.001 0.001 0.001 


Residue numbers 9325 3728 2186 2922 


R-factor (%) 29.7 27.8 




Rms bond angle (°) 1.27 1.33 1.38 1.40 


Favoured 8066 (91.4%) 3266 (91.9%) 1810 (90.9%) 2531 (89.3%) 


Outliers 152 (1.7%) 50 (1.4%) 37 (1.9%) 39 (1.4%) 


Geometry score (percentile) 2.52 (98) 2.41 (99) 2.79 (95) 2.62 (97) 


Good rotamer (%) 94.8 95.7 93.5 93.2 


b. Deposited maps and associated coordinate files 


Overall map EMD-8012 5GAN 


Head map EMD-8013 5GAO 


Global class 1 (closed state) EMD-8007 


Masked body/arm class 1 EMD-8008 


Masked body/arm class 3 EMD-8010 
Super-catastrophic disruption of asteroids at small perihelion distances 


Mikael Granvik, Alessandro Morbidelli, Robert Jedicke, Bryce Bolin, William F. Bottke, Edward Beshore, 
David Vokrouhlický, Marco Delbo & Patrick Michel 


Most near-Earth objects came from the asteroid belt and drifted 
via non-gravitational thermal forces into resonant escape routes 
that, in turn, pushed them onto planet-crossing orbits!">. Models 
predict that numerous asteroids should be found on orbits that 
closely approach the Sun, but few have been seen. In addition, even 
though the near-Earth-object population in general is an even 
mix of low-albedo (less than ten per cent of incident radiation is 
reflected) and high-albedo (more than ten per cent of incident 
radiation is reflected) asteroids, the characterized asteroids near 
the Sun typically have high albedos*. Here we report a quantitative 
comparison of actual asteroid detections and a near-Earth-object 
model (which accounts for observational selection effects). We 
conclude that the deficit of low-albedo objects near the Sun arises 
from the super-catastrophic breakup (that is, almost complete 
disintegration) of a substantial fraction of asteroids when they 
achieve perihelion distances of a few tens of solar radii. The distance 
at which destruction occurs is greater for smaller asteroids, and their 
temperatures during perihelion passages are too low for evaporation 
to explain their disappearance. Although both bright and dark 
(high- and low-albedo) asteroids eventually break up, we find that 
low-albedo asteroids are more likely to be destroyed farther from the 
Sun, which explains the apparent excess of high-albedo near-Earth 
objects and suggests that low-albedo asteroids break up more easily 
as a result of thermal effects. 

Most near-Earth-object (NEO) discoveries during the past decade 
have been made by the Catalina Sky Survey (CSS), a combination of 
two distinct observatories with complementary capabilities. The 1.5-m 
Mt Lemmon telescope (code G96) typically detects faint NEOs close 
to the ecliptic, whereas the 0.8-m Catalina telescope (code 703) covers 
a larger area of the sky but focuses on brighter targets. From 2005 to 
2012, CSS made 7,952 serendipitous detections of 3,632 distinct NEOs 
with absolute magnitudes H ranging from 17 to 25 during nights that 
had well established estimates for the detection efficiency. The detec- 
tions were roughly equally shared between the two telescopes and the 
detected NEOs provide an extensive coverage of the parameter space 
(Supplementary Fig. 1). 

We use the CSS NEO detections to constrain a model describ- 
ing the true number of NEOs, N(a, e, i, H), as a function of orbital 
parameters (semimajor axis a, eccentricity e and inclination i) 
and absolute magnitude H, a proxy for the physical size. Our new 
model of the NEO population is based on the methodology of 
ref. 3. First, we tracked the dynamical evolution of test asteroids 
from seven source regions or escape routes, s, in the main asteroid 
belt and nearby small-body reservoirs all the way into the inner 
Solar System (Supplementary Figs 2 and 3). Next, the orbital 
pathways followed by the bodies were assembled into source-specific 
steady-state orbital distributions, R_s(a, e, i) (Supplementary 
Fig. 4). These functions were then multiplied by source-specific 


parametric absolute-magnitude distributions, N_s(H), and added 
together to produce N(a, e, i, H). The model was then multiplied by 
the computed? observational selection effects of CSS, B(a, e, i, H) 
(Supplementary Fig. 5), thus obtaining a biased NEO model N(a, e, i, H) × 
B(a, e, i, H). The free parameters describing the N_s(H) functions were 
determined by best-fitting the biased NEO model to the distribution of 
thousands of NEO detections from the CSS, n(a, e, i, H), thus yielding 
a new and improved NEO population model. These computations are 
described in greater detail in Supplementary Information. 
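
The construction described above (sum the source-specific orbital distributions weighted by their absolute-magnitude distributions, then multiply by the selection effects) can be sketched on a toy grid. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the dictionary layout, the two sources, the two orbit cells and all numbers are hypothetical.

```python
def biased_model(R, N_H, B):
    """Expected detections per (orbit cell, H bin) for a toy NEO model.

    R   -- {source: {orbit_cell: steady-state fraction R_s}}
    N_H -- {source: {H_bin: number of objects N_s(H)}}
    B   -- {(orbit_cell, H_bin): detection probability}
    Computes sum over sources of R_s * N_s(H), multiplied by B.
    """
    out = {}
    for source, orbit_dist in R.items():
        for cell, frac in orbit_dist.items():
            for h_bin, n in N_H[source].items():
                key = (cell, h_bin)
                out[key] = out.get(key, 0.0) + frac * n * B.get(key, 0.0)
    return out


# Hypothetical toy inputs: two sources, two orbit cells, one H bin.
R = {"s1": {"c1": 0.25, "c2": 0.75}, "s2": {"c1": 0.5, "c2": 0.5}}
N_H = {"s1": {18: 100.0}, "s2": {18: 200.0}}
B = {("c1", 18): 0.5, ("c2", 18): 0.1}

pred = biased_model(R, N_H, B)
for key in sorted(pred):
    print(key, round(pred[key], 3))
```

In a fit, the free parameters inside N_s(H) would be adjusted until such predictions match the observed detection counts n(a, e, i, H); that optimisation step is not shown here.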

The observed a, e, i and H distributions generally agree with the 
biased NEO model (Supplementary Fig. 6). The distribution of perihelion 
distance, q = a(1 − e), reveals a systematic offset in that the model predicts too many 
NEOs with small q and too few with large q (Fig. 1). If we assume that 
the overprediction at q < 0.6 astronomical units (au) is the real source 
for the discrepancy, then the underprediction at large q is a feature 
resulting from how we fit the model to the data: the absolute number 
of NEO detections is one of the constraints and therefore the method 
compensates for an overprediction at small q with an underprediction 
at large q. The discrepancy in the q distribution has not been noticed 
in the past because previous models were calibrated with much smaller 
samples of NEOs and the discrepancy was not statistically significant®. 
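
For reference, the perihelion distance used in this comparison follows directly from the semimajor axis and eccentricity. The short sketch below is illustrative only: the function name and the example orbit are hypothetical, and the au-to-solar-radii factor is an approximate value added here for convenience, not a number taken from the text.

```python
AU_IN_SOLAR_RADII = 215.03  # approximate: 1 au divided by the solar radius


def perihelion_distance(a_au, e):
    """Perihelion distance q = a(1 - e) in au for a bound orbit (0 <= e < 1)."""
    if not 0.0 <= e < 1.0:
        raise ValueError("eccentricity must satisfy 0 <= e < 1")
    return a_au * (1.0 - e)


# Hypothetical orbit: a = 2.0 au, e = 0.75.
q = perihelion_distance(2.0, 0.75)
print(f"q = {q:.2f} au (~{q * AU_IN_SOLAR_RADII:.0f} solar radii)")
```

The example orbit has q = 0.5 au, which places it in the q < 0.6 au range where the model overpredicts detections.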

A possible explanation for the discrepancy is that the combination 
of orbital steady-state distributions R_s(a, e, i) is not flexible enough to 
allow for a good fit at small q. To test this explanation we divided the 
test asteroids into a larger number N_s of source regions and re-did the 
analysis. It turns out that even a model with N_s = 24 (Supplementary 
Fig. 7) is unable to match the observed q distribution (Supplementary 
Fig. 8). To rule out systematic problems with our orbital steady-state 
distributions we also tried orbital steady-state distributions computed by 
others using different starting conditions and integration parameters’. 
The alternative orbital distributions resulted in an even worse fit to 
the observed data, at least partly explained by the smaller number of 
source regions considered (Supplementary Fig. 9), and do not solve the 
discrepancy in q (Supplementary Fig. 8). 

To validate the adopted methodology and, in particular, the bias 
correction, we carried out two tests which made use of the fact 
that CSS is composed of two surveys, G96 and 703, with partly 
complementary capabilities. The first test was to fit separately to G96 
and to 703 to identify problems in either one of the surveys or their 
estimated selection effects. The discrepancy in q is present in both 
cases (Supplementary Fig. 8) and we conclude that the discrepancy 
in q is not specific to the chosen survey. The second test was a cross- 
check of the results: we estimated model parameters by fitting just 
the data from 703 (G96) in the limited range 0.7 au < q < 1.3 au,
used the resulting model and estimates for selection effects to pre- 
dict the absolute number of detections for G96 (703) in the same q 
range, and compared the prediction to the data. The results show
that the adopted methodology and bias corrections are sound
and that the simultaneous use of complementary data sets leads
to improved and more accurate models (Supplementary Fig. 10).
Extrapolations to q < 0.7 au systematically show that CSS should have
discovered more NEOs on orbits with small q if such objects existed
(Supplementary Fig. 8). We therefore conclude that some physical
mechanism must be reducing the number of NEOs at small q.

Figure 1 | Observed and predicted perihelion distances for NEOs
detected by the CSS during 2005-2012. The observed perihelion distance
distribution (black) compared with the biased model predictions with
(green) and without (red) assuming a disruption at q* = 0.076 au. a, The
observed and predicted number of NEO detections. b, The ratio between
the observed and predicted number of NEO detections. The model
without disruption shows a systematic overprediction at small q, whereas
assuming a disruption breaks the trend and reproduces the observed q
distribution. The s.e.m. error bars are computed assuming Poisson
statistics.

We propose that when NEOs reach some critical perihelion distance,
q*, they catastrophically disrupt. To test the proposed mechanism
we constructed alternative orbital steady-state distributions by
considering the dynamical evolution of test asteroids only until their
q becomes smaller than a pre-defined value q*. The observed q distribution
is best reproduced with q* = (0.0760 ± 0.0025) au ≈ (16 ± 0.5) R☉
(Fig. 1). The observed (a, e, i, H) distributions are also
accurately reproduced (Supplementary Fig. 11). The best-fit
model predicts that there are (7.32 ± 1.33) × 10^5 NEOs with
17 < H < 25 and 1,008 ± 45 NEOs with H < 17.75. Both numbers
agree with the most recent estimates (Supplementary Fig. 12). The
best-fit model also reproduces the observed relative fractions of Amor
(1.017 au < q < 1.3 au), Apollo (a > 1 au and q < 1.017 au) and Aten
(a < 1 au and aphelion distance Q > 0.983 au) asteroids with
17 < H < 17.5: the observed fractions are 47%, 50% and 3%, whereas
the model predicts 43 ± 5%, 53 ± 5% and 3.5 ± 0.6%, respectively. See
the Supplementary Video for an animation of how the orbit distribution
changes with H.
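
The orbit classes quoted above can be written down as a short sketch. It only encodes the boundaries given in the text (q = a(1 − e), Q = a(1 + e), and the Amor, Apollo and Aten limits); the function names and the "other" fallback label are illustrative assumptions, not part of the authors' pipeline.

```python
def perihelion(a, e):
    """Perihelion distance q = a(1 - e), in au."""
    return a * (1.0 - e)


def aphelion(a, e):
    """Aphelion distance Q = a(1 + e), in au."""
    return a * (1.0 + e)


def classify_neo(a, e):
    """Assign an orbit class using the limits quoted in the text.

    a is the semimajor axis in au and e the eccentricity.
    "other" is a hypothetical label for orbits outside the three classes.
    """
    q = perihelion(a, e)
    Q = aphelion(a, e)
    if 1.017 < q < 1.3:
        return "Amor"
    if a > 1.0 and q < 1.017:
        return "Apollo"
    if a < 1.0 and Q > 0.983:
        return "Aten"
    return "other"
```

For example, an orbit with a = 1.5 au and e = 0.2 has q = 1.2 au and falls in the Amor class.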

To assess the effect of size on q* we divided the NEOs detected by
CSS into three different groups as a function of H, and fitted q* separately
to each of these groups. The result shows a clear trend of increasing
q* with increasing H (Fig. 2), that is, an inverse correlation between
q* and physical size. A direct consequence of the inverse correlation is
that a kilometre-scale asteroid has to disrupt into fragments smaller 
than a few tens of metres in a single event or through a disruption 
cascade, depending on the disruption mechanism. The disruption distances
are too large to be explained by tidal effects and evaporation.
While the average surface temperature of the sunlit hemisphere on
mid-sized NEOs may surpass 900 K, the resulting diurnal heat waves
will penetrate only to depths of some tens of centimetres. Silicates
immediately below the surface layer will therefore remain solid. This
reveals that the actual disruption mechanism, although clearly related
to temperature, is not trivial. A possibility is that rocks break into small
grains by thermal cracking11 and the grains are then blown away from
the asteroid by radiation pressure12. Another possibility is that the

Figure 2 | Average disruption distance as a function of absolute
magnitude and an asteroid’s physical size. We split the NEO detections
by CSS into three different H groups, with the H range shown by the
horizontal error bars (17 < H < 19 contains 3,326 detections, 20 < H < 22
contains 1,669 detections, and 23 < H < 25 contains 913 detections). The
average dynamical lifetime of NEOs is a few million years and asteroids on
circular orbits with r = q* above the purple line will not have evaporated
after 4.5 billion years, the age of the Solar System. Evaporation can thus not
explain the disappearance of small and mid-sized NEOs and, given the
timescales, it is also an unlikely disruption mechanism for large NEOs.
The brown horizontal line marks the Sun’s Roche limit for a hypothetical
fluid comet with a density of 0.5 g cm^-3, and serves as an approximate
upper limit for tidal disruption of asteroids and comets. The red dashed
lines correspond to the equilibrium temperature, T_eq = [(1 − A)L☉/
(16πεσr²)]^0.25, at perihelion, r = q, when assuming a Bond albedo of
A = 0.07 and an infrared emissivity of ε = 0.9. L☉ = 3.827 × 10^26 W is the
solar luminosity and σ = 5.6704 × 10^-8 W m^-2 K^-4 is the Stefan-Boltzmann
constant. The blue dashed lines correspond to the simple estimate of the
temperature average over a sunlit hemisphere T_av = 4T_ss/5 with the subsolar
temperature T_ss = √2 T_eq. The true average surface temperature will lie
somewhere between T_eq and T_av, because T_eq does not allow for local
temperature variations and T_av does not account for conduction and
sublimation. The conversion between H and diameter assumes a geometric
albedo of 0.15. The detection-weighted average q* of the three groups is
0.094 ± 0.010 au, which is about 24 ± 17% larger than the value obtained
by fitting all groups simultaneously. The difference is a systematic error
resulting from averaging over H. The line connecting the three groups
emphasizes the linear (nonlinear) relation between q* and H
(diameter). The s.d. error bars on q* were estimated by generating
50 random representations of the best-fit model and re-fitting for q*.
We required that the solutions for q* must reproduce the observed q
distribution: that is, all q* that were substantially larger than the smallest
observed q were discarded.
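
As a rough numerical illustration of the temperature estimates in this caption, the sketch below evaluates T_eq = [(1 − A)L☉/(16πεσr²)]^(1/4), the subsolar temperature T_ss = √2 T_eq and the hemispheric average T_av = 4T_ss/5. The constants are those quoted in the caption; the au-to-metre conversion and the function names are assumptions added for the example.

```python
import math

L_SUN = 3.827e26        # solar luminosity in W (from the caption)
SIGMA = 5.6704e-8       # Stefan-Boltzmann constant in W m^-2 K^-4
AU_M = 1.495978707e11   # metres per au (assumed conversion, not in the caption)


def t_eq(r_au, bond_albedo=0.07, emissivity=0.9):
    """Equilibrium temperature in K at heliocentric distance r_au (in au)."""
    r = r_au * AU_M
    flux_term = (1.0 - bond_albedo) * L_SUN
    return (flux_term / (16.0 * math.pi * emissivity * SIGMA * r ** 2)) ** 0.25


def t_subsolar(r_au, **kwargs):
    """Subsolar temperature, T_ss = sqrt(2) * T_eq."""
    return math.sqrt(2.0) * t_eq(r_au, **kwargs)


def t_average(r_au, **kwargs):
    """Average over the sunlit hemisphere, T_av = 4 * T_ss / 5."""
    return 0.8 * t_subsolar(r_au, **kwargs)
```

Evaluating `t_eq(0.076)` gives a value slightly above 1,000 K under these assumptions, which is in line with the statement that sunlit surfaces near the disruption distance exceed 900 K.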

Figure 3 | Empirical distribution functions in perihelion distance for
low-albedo and high-albedo NEOs detected by WISE. The WISE data set
is biased against high-albedo NEOs: for a given absolute magnitude H, the
higher the geometric albedo, the fainter the infrared apparent magnitude
W3. The WISE detection efficiency is close to 100% up to W3 = 9.5, drops
to 50% at W3 = 10 and is close to zero beyond W3 = 10.5. This means
that asteroids with large enough albedos would not have been observed
by WISE. To correct for the WISE albedo bias we assume that its limiting
magnitude is W3_lim = 10. Then, for each NEO with a WISE-determined
albedo we identify all its WISE-reported observations at different epochs,
select the smallest apparent magnitude W3_min of all reported values, and
reject that NEO from consideration if W3_min > W3_lim. This reduces the
initial number of 394 NEOs to 326 with W3_lim = 10 and H > 15, the latter
requirement ensuring that the albedos correspond to relatively small
NEOs. The Anderson-Darling test applied to empirical distribution
functions in q shows that it is extremely unlikely that the 133 low-albedo
and 193 high-albedo NEOs have a common parent distribution
(p ≈ 0.0003) when accounting for q < 1.3 au. A common origin for the
q distributions becomes reasonable when limiting the analysis to NEOs
with 0.6 au < q < 1.3 au only, that is, 119 low-albedo and 140 high-albedo
NEOs (p ≈ 0.12).
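
The sample selection described in this caption can be sketched as a simple filter. The record layout (dictionaries with `H` and a list of `w3` magnitudes) and the function names are hypothetical; only the two cuts, W3_min ≤ W3_lim = 10 and H > 15, come from the caption.

```python
W3_LIM = 10.0   # assumed WISE limiting magnitude, as stated in the caption
H_MIN = 15.0    # absolute-magnitude requirement, as stated in the caption


def passes_w3_cut(w3_observations, w3_lim=W3_LIM):
    """Keep an NEO unless its brightest (smallest) W3 magnitude exceeds the limit."""
    return min(w3_observations) <= w3_lim


def select_sample(neos, w3_lim=W3_LIM, h_min=H_MIN):
    """Return the NEO records that satisfy both the W3 and the H requirement."""
    return [
        neo for neo in neos
        if neo["H"] > h_min and passes_w3_cut(neo["w3"], w3_lim)
    ]
```

The filtered low-albedo and high-albedo subsets would then be compared with a two-sample test such as the Anderson-Darling test named in the caption.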


anisotropic emission of thermal photons13 or the scattering of sublimating
gas molecules14 cause the asteroids to spin faster, to the point
when gravity and cohesive forces can no longer keep them intact. A 
third possibility is that all asteroids contain volatile elements that, when 
sublimating at a moderate temperature, exert enough pressure on the 
body to blow it up. 

To gain insight into the process leading to asteroid disruption, we 
investigated whether asteroids with different surface properties behave 
differently. For this purpose, we compared the q distributions of low- 
albedo and high-albedo NEOs detected by the Wide-field Infrared 
Survey Explorer (WISE) mission. The Anderson-Darling test15 shows
that the probability that these two samples come from a common
parent distribution is only 0.03%, whereas a reasonable agreement is
found when limiting the analysis to NEOs with q > 0.6 au (Fig. 3). This
result agrees with the results of an independent analysis of WISE data,
which showed that the observed Aten asteroids have, on average, higher 
albedos than Apollo and Amor asteroids. This can be explained if low- 
albedo NEOs disrupt, on average, farther from the Sun than high-albedo 
asteroids of comparable size, implying that they have different physical 
properties that make them more vulnerable to strong solar irradia- 
tion. The fact that the q distribution of low-albedo NEOs appears to 
be steeper than that of high-albedo NEOs at q < 0.6 au supports this 
conclusion: the larger q* is, the steeper is the resulting q distribution
(Supplementary Fig. 13). A larger average disruption distance may be 
due to a higher volatile budget in low-albedo asteroids, as suggested 
by the composition of the most primitive carbonaceous meteorites 
(usually expected to be related to these bodies) and by the quite common 
presence of hydration bands in their spectra. Thermal cracking is also 
more efficient for carbonaceous meteorites than for ordinary chon- 
drites (the meteorites associated with high-albedo asteroids). We also 
note that darker NEOs experience somewhat greater heating and may
therefore be more prone to thermally driven disruption. 

In our model we assume that an instantaneous disruption takes place 
when q < q* for an NEO. We predict that NEO disruptions must take
place in less than 250 years, the timescale used to record our model
data. Our results are consistent with the (q, H) distribution of known
asteroids. Asteroid (394130) 2006 HY51 has the smallest perihelion
distance, q = 0.081 au ≈ 17.4 R☉, among known NEOs with reliable
estimates for the absolute magnitude. Its absolute magnitude, H = 17.2,
is in agreement with our assessment of the average disruption distance
(Fig. 2). Our results are also in agreement with observations of a slow
erosion of the asteroid (3200) Phaethon16, which is too large to disrupt
catastrophically on its current orbit. 

The recorded inclination distribution of test asteroids at the disrup- 
tion epoch overlaps with the observed inclination distribution of 
q < 0.184 au meteor showers identified in data obtained by the
Canadian Meteor Orbit Radar17,18 (CMOR; Supplementary Fig. 14).
While covering the same range, the latter distribution is skewed towards 
larger values, which can be understood considering that radar is more 
sensitive to high-speed meteors and hence the orbital distribution is 
biased against low-inclination orbits. Super-catastrophic disruptions 
are consistent with the fact that parent bodies have yet to be detected 
for most meteor streams with small q and small i identified in CMOR 
data. Given the inverse correlation between q* and asteroid diameter
we predict that the average total mass of meteor streams lacking obvious
parent bodies should diminish as a function of q as long as q < 0.2 au
and i < 40°.

In the future, a detailed understanding of the circumstances leading 
to disruption of asteroids at different q values may offer insight into 
their bulk composition as well as internal structure. In particular, a 
quantitative assessment of the volatile content for NEOs, and hence of 
their siblings in the main asteroid belt, by mapping out the disruption 
probability as a function of q and source region, would complement 
current approaches, which usually rely on extrapolation from surface
properties19 and the detection of comet-like activity20,21.

Universal resilience patterns in complex networks


Jianxi Gao1*, Baruch Barzel2* & Albert-László Barabási1,3,4,5


Resilience, a system’s ability to adjust its activity to retain its basic
functionality when errors, failures and environmental changes
occur, is a defining property of many complex systems1. Despite
widespread consequences for human health2, the economy3 and
the environment4, events leading to loss of resilience, from
cascading failures in technological systems5 to mass extinctions
in ecological networks6, are rarely predictable and are often
irreversible. These limitations are rooted in a theoretical gap: the
current analytical framework of resilience is designed to treat
low-dimensional models with a few interacting components7,
and is unsuitable for multi-dimensional systems consisting of 
a large number of components that interact through a complex 
network. Here we bridge this theoretical gap by developing a set of 
analytical tools with which to identify the natural control and state 
parameters of a multi-dimensional complex system, helping us 
derive effective one-dimensional dynamics that accurately predict 
the system’s resilience. The proposed analytical framework allows 
us systematically to separate the roles of the system’s dynamics 
and topology, collapsing the behaviour of different networks 
onto a single universal resilience function. The analytical results 
unveil the network characteristics that can enhance or diminish 
resilience, offering ways to prevent the collapse of ecological, 
biological or economic systems, and guiding the design of 
technological systems resilient to both internal failures and 
environmental changes. 

The traditional mathematical treatment of resilience used
from ecology to engineering approximates the behaviour of a
complex system with a one-dimensional (1D) nonlinear dynamic
equation

$$\frac{dx}{dt} = f(\beta, x) \tag{1}$$

The functional form of f(β, x) represents the system’s dynamics, and
the parameter β captures the changing environmental conditions. The
system is assumed to be in one of the stable fixed points, x_0, of equation
(1), extracted from

$$f(\beta, x_0) = 0 \tag{2}$$

$$\left.\frac{\partial f}{\partial x}\right|_{x = x_0} < 0 \tag{3}$$


where equation (2) provides the system's steady state and equation (3)
guarantees its linear stability. The solution of equations (2) and (3) provides
the resilience function x(β), which represents the possible states
of the system as a function of β (Fig. 1a-c). The shape of this function
is uniquely determined by the functional form of f(β, x). In contrast,
the momentary state of the system is determined by the tunable parameter
β. At some critical point β_c, the resilience function may feature a
bifurcation (Fig. 1a) or become non-analytic (Fig. 1b, c), indicating that
the system loses its resilience by undergoing a sudden transition to a
different, often undesirable, fixed point of equation (1).

Although it is conceptually powerful, this analytic framework does 
not account for the exceptionally large number of variables that in 
reality control the state of a complex system. Indeed, real systems 
are composed of numerous components linked via a complex set of 
weighted, often directed, interactions10,11, and controlled by not one
microscopic parameter, but by a large family of parameters, such
as the weights of all interactions. Hence, instead of a 1D function
f(β, x), characterized by a single parameter β, their state should be
described by a network of coupled nonlinear equations that capture
the interactions between the system’s many components, and
account for the complex interplay between the system’s dynamics
and changes in the underlying network topology. The resulting
resilience function is therefore a multi-dimensional manifold over 
the complex parameter space characterizing the system (Fig. 1d-f), 
which, using the current tools, cannot be treated analytically. Here 
we overcome these longstanding limitations by developing a general 
network-based theoretical framework that allows us to explore and 
predict the multiple roots and dimensions of resilience, exposing cru- 
cial determinants of resilience loss in complex natural and man-made 
systems. 

Consider a system consisting of N components (nodes) whose activities
x = (x_1, ..., x_N)^T follow the coupled nonlinear equations12,13

$$\frac{dx_i}{dt} = F(x_i) + \sum_{j=1}^{N} A_{ij}\, G(x_i, x_j) \tag{4}$$

The first term on the right-hand side of equation (4) describes the
self-dynamics of each component, while the second term describes
the interactions between component i and its interacting partners. The
nonlinear functions F(x_i) and G(x_i, x_j) represent the dynamical laws
that govern the system’s components, while the weighted connectivity
matrix A_ij captures the interactions between the nodes. With an
appropriate choice of F(x_i) and G(x_i, x_j), equation (4) is used to model
numerous systems known for their resilience, ranging from cellular14
to ecological15,16 and social systems17.

In analogy with the 1D system of equation (1), a transition from a
desired to an undesired stable fixed point captures the loss of resilience
in the multi-dimensional system of equation (4). The key difference is
that in equation (4) resilience loss can be induced by changes in any of
the N² parameters of the weighted network A_ij, each change capturing
a different kind of perturbation (Fig. 1g). For instance, the extinction/
introduction of species in an ecological system may correspond to the
removal/addition of one or several nodes. Alternatively, the loss of
an enzyme catalysing a reaction in a metabolic network may correspond
to link removal. Finally, global environmental changes, such
as increases in ocean acidity or global warming, may correspond to
global shifts in the weights of A_ij.

We illustrate the challenges such multi-dimensional systems offer by 
exploring the mutualistic interactions among species in an ecological 


1Center for Complex Network Research, Department of Physics, Northeastern University, Boston, Massachusetts 02115, USA. 2Department of Mathematics, Bar-Ilan University, Ramat-Gan 52900,
Israel. 3Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Harvard University, Boston, Massachusetts 02215, USA. 4Department of Medicine, Brigham and Women’s Hospital,
Harvard Medical School, Boston, Massachusetts 02115, USA. 5Center for Network Science, Central European University, Budapest 1051, Hungary.


*These authors contributed equally to this work. 


18 FEBRUARY 2016 | VOL 530 | NATURE | 307 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


Figure 1 | Network resilience. a-c, In 1D systems
resilience is captured by the resilience function
x(β), which describes the state(s) of the system as a
function of the tunable parameter β. We illustrate
three typical examples. a, The bifurcating resilience
function. The system exhibits a single stable fixed
point for β > β_c (blue) and two (or more) stable
fixed points, a desired (blue) and an undesired
(red) for β < β_c. b, Resilience function with a
first-order transition from the desired (blue) state to the
undesired (red) state. c, Resilience function with a

network. Here equation (4) tracks the abundance x_i(t) of species i,
following

$$\frac{dx_i}{dt} = B_i + x_i\left(1 - \frac{x_i}{K_i}\right)\left(\frac{x_i}{C_i} - 1\right) + \sum_{j=1}^{N} A_{ij}\, \frac{x_i x_j}{D_i + E_i x_i + H_j x_j} \tag{5}$$


The first term on the right-hand side of equation (5) accounts for the
incoming migration of i at a rate B_i from neighbouring ecosystems.
The second term describes logistic growth with the system carrying
capacity K_i, and the Allee effect, according to which for low abundance
(x_i < C_i) the system features negative growth. The third term
describes mutualistic interactions, captured by a response function
that saturates for large x_i or x_j, indicating that j’s positive contribution
to x_i is bounded. To construct A_ij we used symbiotic interactions,
such as plant-pollinator relationships, collected for seven ecological
systems (Supplementary Table 1), describing networks ranging
from N = 10 to N = 1,429 nodes (Fig. 2a, b). We numerically solved
equation (5), and tested its resilience under three realistic perturbations
(Fig. 2c): first, we randomly removed a fraction f_n of nodes,
capturing plant extinctions; second, we removed a random fraction
f_l of pollinators, thus perturbing the mutualistic link weights;
and finally, we randomly rescaled all the weights A_ij, reducing their
strength on average by a fraction f_w, to mimic global environmental
changes.
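
To make the numerical experiment concrete, here is a minimal sketch of integrating the mutualistic dynamics, assuming the form dx_i/dt = B + x_i(1 − x_i/K)(x_i/C − 1) + Σ_j A_ij x_i x_j/(D + E x_i + H x_j) with the uniform parameter values quoted in the Fig. 2 caption (B = 0.1, C = 1, K = 5, D = 5, E = 0.9, H = 0.1). The forward-Euler scheme, step size and function names are choices made for this illustration; the source does not state which solver the authors used.

```python
def mutualistic_rhs(x, A, B=0.1, C=1.0, K=5.0, D=5.0, E=0.9, H=0.1):
    """Right-hand side of the assumed mutualistic dynamics for all nodes."""
    n = len(x)
    dx = []
    for i in range(n):
        # Logistic growth combined with the Allee effect.
        growth = x[i] * (1.0 - x[i] / K) * (x[i] / C - 1.0)
        # Saturating mutualistic contribution from every partner j.
        coupling = sum(
            A[i][j] * x[i] * x[j] / (D + E * x[i] + H * x[j])
            for j in range(n)
        )
        dx.append(B + growth + coupling)
    return dx


def integrate(x0, A, dt=0.01, steps=20000):
    """Forward-Euler integration; abundances are clamped at zero."""
    x = list(x0)
    for _ in range(steps):
        dx = mutualistic_rhs(x, A)
        x = [max(xi + dt * di, 0.0) for xi, di in zip(x, dx)]
    return x
```

With all weights set to zero, a low initial abundance settles in a low-abundance state and a high initial abundance settles near the carrying capacity, which is the bistability the text describes. Removing nodes or rescaling `A` before calling `integrate` mimics the three perturbations listed above.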

We find that for small perturbations the system maintains its resilience:
its only stable fixed point is x^H, in which the average abundance
⟨x⟩ is high. However, when the perturbation exceeds a certain
threshold a bifurcation occurs, resulting in two stable fixed points: the
desired x^H and an undesired low-abundance state x^L (Fig. 2d). Under
these conditions the system loses its resilience, potentially transitioning
to the undesired x^L. The precise bifurcation point marking
this loss of resilience is, however, highly unpredictable. For instance,

stable solution for β < β_c, and no solution above β_c,
resulting in an uncontrolled divergent or chaotic
behaviour. d-f, In a multi-dimensional system, the
single parameter β is replaced by the complex
weighted network A_ij, whose characteristics
depend on both environmental conditions and
the specific pairwise interaction strengths.
Consequently, the resilience function, now
capturing the behaviour of the vector state x(A_ij),
is a multi-dimensional manifold, prohibiting
analytical treatment. The 3D plots show the
resilience plane for a four-node system, displaying
x as a function of two link weights (one of them A_23)
while two others (including A_34) are held fixed. The full
description of an N-dimensional system requires
an N²-dimensional plane, tracking the state of the
system as a function of all A_ij. g, After applying our
formalism, the multi-dimensional manifold shown
in d-f collapses into a 1D resilience function in
β-space (blue and red solid lines). The structure of
this function, and hence its critical points β_c
(dashed lines), is fully determined by the system's
dynamics F(x_i) and G(x_i, x_j): ecological, regulatory,
power transmission and so on (left). The network
topology A_ij (right) determines β_eff through
equation (8), and hence the specific state of the
system (brown dot) along the resilience function.


Net1 displays a different resilience pattern for node removal (Fig. 2d), 
link removal (Fig. 2e) or global weight changes (Fig. 2f), indicating that 
different forms of perturbations lead to different outcomes within the 
same system. Such diversity is also observed for similar perturbations 
across different systems. For example, while we need to remove at least 
35% of the pollinators for Net1 to lose its resilience (Fig. 2e), Net5
remains resilient even after 80% of its pollinators are deleted (Fig. 2h). 
Finally, even the sequence in which the perturbation is applied makes 
an important difference: Net1 can lose its resilience anywhere between 
the removal of 30% to 70% of its nodes (Fig. 2d), depending on the 
specific trial. 

Altogether, we analysed fourteen experimentally mapped networks 
(Supplementary Table 1), finding that the resilience transition depends 
on the network topology, the form and the nature of the perturbation 
applied and the specific realization (Fig. 2d-l, and Supplementary 
Figs 2-4), exposing severe limits to our ability to predict the network 
resilience. We now show that this seemingly unpredictable behaviour 
can be systematically treated by focusing on the system's natural state 
and control variables. In a network environment, the state of each 
node is affected by the state of its immediate neighbours. Therefore,
we characterize the effective state of the system using the average
nearest-neighbour activity

$$x_{\mathrm{eff}} = \frac{\mathbf{1}^{T} A\, \mathbf{x}}{\mathbf{1}^{T} A\, \mathbf{1}} = \frac{\langle s^{\mathrm{out}} x \rangle}{\langle s \rangle} \tag{6}$$

where s^out = (s_1^out, ..., s_N^out)^T is the vector of outgoing weighted degrees
and s^in = (s_1^in, ..., s_N^in)^T is the vector of incoming weighted degrees,
⟨s^out x⟩ = (1/N) Σ_i s_i^out x_i, ⟨s⟩ = ⟨s^in⟩ = ⟨s^out⟩ is the average weighted
degree, and 1 is the unit vector 1 = (1, ..., 1)^T. As we show in
Supplementary Information section I, if A_ij has little degree
correlation, the variable x_eff in equation (6) allows us to reduce the

Figure 2 | Resilience in ecological networks.
a, The bipartite network M_ij consists of nodes
representing symbiotically connected species,
such as plants and pollinators or fish and anemone.
b, From M_ij we construct two mutualistic networks
(Supplementary Information section II.B) by linking
pairs of plants that share mutual pollinators (A_ij), or
pollinators that share mutual plants (B_ij). c, We tested
the resilience of fourteen mutualistic networks
(equation (5); B_i = B = 0.1, C_i = C = 1, K_i = K = 5,
D_i = D = 5, E_i = E = 0.9 and H_j = H = 0.1) against:
(1) extinction of a fraction f_n of plants; (2) extinction
of a fraction f_l of pollinators, impacting the relevant
plant link weights; (3) decreasing all weights on
average to a fraction f_w of their original value,
simulating a global change in the environmental
conditions, for example, varying temperature. d, The
average abundance of fish in Net1 versus f_n across 100
realizations (we highlight one of these realizations in
black). At a critical fraction f_n the system undergoes a
bifurcation, where in addition to the high-abundance
state (x^H) an undesired low-abundance state emerges
(x^L). However, owing to the multi-dimensionality of
A_ij each realization is microscopically distinct, and as
a result the bifurcation point is unpredictable, ranging
from f_n = 0.3 to f_n = 0.7 across different realizations.
e, f, Similar diversity characterizes the system's
response to link perturbation f_l (e) and global
perturbations f_w (f). g-l, The difficulty in
predictability is also observed in the larger networks
Net5 and Net7. Additional results appear in
Supplementary Figs 2-4. m, Our formalism predicts
that in β-space the resilience function takes a
universal form, similar to that shown in Fig. 1a, with
bifurcation at β_eff^c = 6.97, independent of A_ij. n, All
data in d-l and in Supplementary Figs 2-4,
comprising 28 highly diverse networks, collapses
onto the universal resilience function predicted in m,
indicating that regardless of the network structure and
the form of perturbation, the state of the system is
fully determined by β_eff (equation (8)).

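multi-dimensional equation (4) to an effective 1D equation, written in
terms of x_eff

$$\frac{dx_{\mathrm{eff}}}{dt} = F(x_{\mathrm{eff}}) + \beta_{\mathrm{eff}}\, G(x_{\mathrm{eff}}, x_{\mathrm{eff}}) \tag{7}$$

where

$$\beta_{\mathrm{eff}} = \frac{\mathbf{1}^{T} A\, \mathbf{s}^{\mathrm{in}}}{\mathbf{1}^{T} A\, \mathbf{1}} = \frac{\langle s^{\mathrm{out}} s^{\mathrm{in}} \rangle}{\langle s \rangle} \tag{8}$$

averages over the product of the outgoing and incoming degrees
of all nodes. This reduction maps the multi-dimensional complex
system (4) into an effective 1D equation of the form of equation (1),
where

$$f(\beta_{\mathrm{eff}}, x_{\mathrm{eff}}) = F(x_{\mathrm{eff}}) + \beta_{\mathrm{eff}}\, G(x_{\mathrm{eff}}, x_{\mathrm{eff}}) \tag{9}$$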


We predict, therefore, that the N² parameters of the microscopic
description A_ij collapse into a single macroscopic resilience parameter
β_eff (see equation (8)), so that regardless of the microscopic details
of the perturbation (node/link removal, weight reduction or any
combination thereof), its impact on the state of the system is fully
accounted for by the corresponding changes in β_eff. This implies that
the rather diverse and unpredictable behaviours observed in Fig. 2
are, in fact, drawn from a single universal resilience function, independent
of the network topology A_ij, and uniquely determined by the
system's dynamics F(x_i) and G(x_i, x_j). The network A_ij, which is fully
condensed into the single macroscopic parameter β_eff, determines
only the specific state of the system along this function. Such mapping
of equation (4) to the 1D equation (7) allows us to take advantage
of the theoretical tools developed for low-dimensional systems and
apply them to a broad range of complex systems.
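
The two macroscopic quantities can be computed directly from a weight matrix. The sketch below assumes the definitions β_eff = 1ᵀA s^in / 1ᵀA1 = ⟨s^out s^in⟩/⟨s⟩ and x_eff = 1ᵀA x / 1ᵀA1 = ⟨s^out x⟩/⟨s⟩, with `A[i][j]` taken to be the weight of the link acting on node i from node j (the convention implied by equation (4)). Function names are illustrative.

```python
def weighted_degrees(A):
    """Return (s_in, s_out) for a square weight matrix A.

    s_in[i] sums row i (links acting on node i);
    s_out[j] sums column j (links emanating from node j).
    """
    n = len(A)
    s_in = [sum(A[i][j] for j in range(n)) for i in range(n)]
    s_out = [sum(A[i][j] for i in range(n)) for j in range(n)]
    return s_in, s_out


def beta_eff(A):
    """Macroscopic resilience parameter <s_out * s_in> / <s>."""
    s_in, s_out = weighted_degrees(A)
    total_weight = sum(s_in)  # equals 1^T A 1
    return sum(o * i for o, i in zip(s_out, s_in)) / total_weight


def x_eff(A, x):
    """Average nearest-neighbour activity <s_out * x> / <s>."""
    _, s_out = weighted_degrees(A)
    return sum(o * xi for o, xi in zip(s_out, x)) / sum(s_out)
```

For an unweighted three-node path (degrees 1, 2, 1) this gives β_eff = (1 + 4 + 1)/4 = 1.5. Node removal, link removal and weight rescaling all act on the system only through how they change this single number, which is the claim made in the paragraph above.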

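To illustrate the power of our formalism we apply it to the mutualistic
networks of Fig. 2. Reducing the multi-dimensional equation (5) to
the form of equation (9) we arrive at the 1D equation (Supplementary
Information section II)

$$\frac{dx_{\mathrm{eff}}}{dt} = B + x_{\mathrm{eff}}\left(1 - \frac{x_{\mathrm{eff}}}{K}\right)\left(\frac{x_{\mathrm{eff}}}{C} - 1\right) + \beta_{\mathrm{eff}}\, \frac{x_{\mathrm{eff}}^{2}}{D + (E + H)\, x_{\mathrm{eff}}} \tag{10}$$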




[Figure 3 panel labels: Node loss; Link loss; S. cerevisiae; E. coli; g, Universal resilience function (Cell death, Resilient state); axis x_eff]


Using equations (2) and (3), equation (10) predicts a bifurcating resilience
function of the form of Fig. 1a, and a transition from a resilient
state with a single stable fixed point, x^H, to a non-resilient state in which
both x^H and x^L are stable. The critical point of this bifurcation is predicted
to be β_eff^c = 6.97, a value fully determined by the dynamics, independent
of the network topology A_ij (Fig. 2m and Supplementary
Information section II).
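
The quoted critical point can be checked numerically. The sketch below assumes the reduced dynamics f(β, x) = B + x(1 − x/K)(x/C − 1) + βx²/(D + (E + H)x), obtained by setting x_i = x_j = x_eff in equation (5), with the parameter values from the Fig. 2 caption. For each x in (0, C) it solves f = 0 for β; the largest such β is the value above which the low-abundance fixed point no longer exists. The grid scan and function names are choices made for this illustration, not the authors' procedure.

```python
def f_eff(x, beta, B=0.1, C=1.0, K=5.0, D=5.0, E=0.9, H=0.1):
    """Assumed 1D reduced dynamics for the mutualistic model."""
    growth = x * (1.0 - x / K) * (x / C - 1.0)
    return B + growth + beta * x * x / (D + (E + H) * x)


def beta_for_fixed_point(x, B=0.1, C=1.0, K=5.0, D=5.0, E=0.9, H=0.1):
    """Value of beta for which x is a fixed point, from f_eff(x, beta) = 0."""
    growth = x * (1.0 - x / K) * (x / C - 1.0)
    return -(B + growth) * (D + (E + H) * x) / (x * x)


def critical_beta(n_grid=100000, C=1.0):
    """Largest beta that still admits a fixed point with 0 < x < C."""
    return max(
        beta_for_fixed_point(C * k / n_grid) for k in range(1, n_grid)
    )
```

Under these assumptions `critical_beta()` evaluates to roughly 6.97, matching the value quoted in the text. For β_eff above that value `f_eff` stays positive throughout 0 < x ≤ C, so only the high-abundance state remains.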

Our formalism predicts that the diversity observed in Fig. 2d-l and 
in Supplementary Figs 2-4, is, in fact, driven by the universal curve 
of Fig. 2m. This universality can be exposed by transitioning to the
natural parameter space of x_eff (see equation (6)) and β_eff (see equation
(8)). Hence we re-plotted all the data of Fig. 2d-l (and Supplementary
Figs 2-4, 28 diverse networks in total), in this effective β-space, finding
that, as predicted, all data points collapse into a single universal curve,
regardless of the size and the topology of the specific ecological net- 
work or the nature of the applied perturbation (Fig. 2n). This collapse 
indicates that our analytically predicted resilience function exposes a 
universality sustained across networks of different size, density, degree 
and weight distributions. Additional extensive testing of this universal- 
ity appears in Supplementary Information section V. 

In summary, the resilience pattern of a complex system is effectively
unpredictable in the natural (x, A_ij) state parameter space. Once, however,
we map the system into β-space we can accurately predict the
system’s response to diverse perturbations and correctly identify the
critical points where the system loses its resilience.

Next we explore the resilience of gene regulatory networks governed
by the Michaelis-Menten equation¹⁴

$$\frac{\mathrm{d}x_i}{\mathrm{d}t} = -B x_i^{f} + \sum_{j=1}^{N} A_{ij}\,\frac{x_j^{h}}{x_j^{h} + 1} \qquad (11)$$



The first term on the right-hand side of equation (11) describes degra- 
dation (f= 1) or dimerization (f= 2); the second term captures genetic 
activation, where the Hill coefficient h describes the level of cooper- 
ation in gene regulation’. Applying this model with B= 1, f=1 and 
h = 2 to the transcription networks of Saccharomyces cerevisiae²⁴ and
Escherichia coli²⁵, we find that under sufficiently large perturbations

Figure 3 | Resilience in gene regulatory networks.
We ran Michaelis-Menten dynamics (equation (11)) on
the transcription regulatory networks of S. cerevisiae²⁴
and E. coli²⁵ (Supplementary Table 2) to model the
dynamics of genetic regulation, providing the average
activity $\langle x \rangle$ of all genes. a, By removing a sufficiently
high fraction $f_n$ of nodes the system undergoes a
transition from $\langle x \rangle > 0$ (alive) to $\langle x \rangle = 0$ (cell death).
The transition point occurs at a different value of $f_n$ in
each realization. b, c, Similar results are found for link
perturbations and global weight changes. d-f, The same
diversity is observed for the E. coli network as well. g, Our
formalism predicts that the behaviour of gene regulation
is captured by a universal resilience function of the form
of Fig. 1b, with a single first-order transition at $\beta_{\mathrm{eff}}^{c} = 2$.
This function is determined by the regulatory dynamics
of equation (11), and is independent of the network
structure or the nature of its perturbations. h, The results
of a-f shown in $\beta$-space. Regardless of the system's
microscopic details, all observed data points, taken
from S. cerevisiae or E. coli, induced by node
perturbations (red), link perturbations (green) or weight
changes (blue), collapse onto the same curve, well
approximated by the analytically predicted universal
resilience function (solid line).




the cell undergoes a transition from a resilient state ($\langle x \rangle > 0$) to cell
death ($\langle x \rangle = 0$, Fig. 3a-f). Once again, resilience loss strongly depends
on the nature of the perturbation (gene knockout, suppression of 
regulatory interactions, environmental change), as well as on the dif- 
ferences between the wiring diagrams of S. cerevisiae versus E. coli 
(Supplementary Table 2). Rewriting equation (11) in the reduced form 
of equation (7), we find that regulatory dynamics follow the universal 
resilience function shown in Fig. 1b, featuring a single first-order tran- 
sition from the active state to cell death at (Supplementary Information 
section II) 


$$\beta_{\mathrm{eff}}^{c} = \frac{h}{h - f}\left(\frac{h - f}{f}\right)^{f/h} \qquad (12)$$


Indeed, in $\beta$-space, all trajectories of Fig. 3a-f collapse onto the
analytically derived resilience function, as predicted by our formalism
(Fig. 3h). In Supplementary Information section IV we develop another 
application area, demonstrating how to apply our formalism to predict 
the resilience of energy supply in the power grid. 
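
The critical point of the regulatory dynamics can be checked with a short calculation. The sketch below assumes that the one-dimensional reduction of equation (11) reads dx/dt = −B x^f + β_eff x^h/(x^h + 1); the function names are hypothetical. A non-zero fixed point then requires β_eff = B x^(f−h) (1 + x^h), and the smallest value of this expression over x > 0 is the critical point. For h > f this minimum has the closed form B · h/(h − f) · ((h − f)/f)^(f/h), which the code compares with a brute-force scan.

```python
def critical_beta_closed_form(f, h, B=1.0):
    """Closed-form minimum of B * x**(f - h) * (1 + x**h) over x > 0."""
    if h <= f:
        raise ValueError("a first-order transition requires h > f")
    return B * h / (h - f) * ((h - f) / f) ** (f / h)


def critical_beta_scan(f, h, B=1.0, x_max=20.0, steps=200_000):
    """Brute-force estimate of the same minimum on a uniform grid in x."""
    best = float("inf")
    for i in range(1, steps + 1):
        x = x_max * i / steps
        best = min(best, B * x ** (f - h) * (1.0 + x ** h))
    return best


if __name__ == "__main__":
    for f, h in ((1, 2), (1, 3), (2, 3)):
        print(f, h, critical_beta_closed_form(f, h), critical_beta_scan(f, h))
```

For B = 1, f = 1 and h = 2 both routes give a critical value of 2, the transition point quoted for Fig. 3g.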

Although the resilience function is uniquely determined by the
dynamical functions $F(x_i)$ and $G(x_i, x_j)$, the actual position of the
system along this curve, capturing its momentary state, is determined
by the weighted network topology $A_{ij}$, as expressed through
$\beta_{\mathrm{eff}}$ in equation (8). This prompts us to ask what aspects of the
network topology determine a system's resilience. We therefore rewrite
$\beta_{\mathrm{eff}}$ as

$$\beta_{\mathrm{eff}} = \langle s \rangle + S\mathcal{H} \qquad (13)$$

where $\langle s \rangle$, $S$ and $\mathcal{H}$ represent three characteristics of $A_{ij}$. The
dependence on the network density $\langle s \rangle$ indicates that denser networks have a
larger $\beta_{\mathrm{eff}}$ (Fig. 4a). The heterogeneity in the weighted degrees $s^{\mathrm{in}}$ and
$s^{\mathrm{out}}$ is captured by $\mathcal{H} = \sigma_{\mathrm{in}}\sigma_{\mathrm{out}}/\langle s \rangle$, where $\sigma_{\mathrm{in}}^{2}$ and $\sigma_{\mathrm{out}}^{2}$ are the variances
of the marginal probability density functions $P(s^{\mathrm{in}})$ and $P(s^{\mathrm{out}})$
respectively (Fig. 4b). Finally, the symmetry between $s^{\mathrm{in}}$ and $s^{\mathrm{out}}$ is captured
by $S = (\langle s^{\mathrm{in}} s^{\mathrm{out}} \rangle - \langle s^{\mathrm{in}} \rangle \langle s^{\mathrm{out}} \rangle)/(\sigma_{\mathrm{in}}\sigma_{\mathrm{out}})$, the in-out weighted-degree
correlation coefficient. This term, $-1 \le S \le 1$, is positive when nodes

Figure 4 | The impact of $A_{ij}$ on resilience. The topological characteristics
that affect a system's resilience through $\beta_{\mathrm{eff}}$ (equation (13)) are: a, the
network density $\langle s \rangle$; b, the heterogeneity in degrees or link weights $\mathcal{H}$; and
c, the symmetry $S$, capturing the correlations between a node's in and out
degrees. d, Phase diagram for mutualistic dynamics in the $\langle s \rangle$-$\mathcal{H}$ plane.
In the resilient phase, the system has a single stable fixed point $x^{\mathrm{H}}$; in the
non-resilient phase the undesired $x^{\mathrm{L}}$ is also stable. For this dynamics the
greater $\beta_{\mathrm{eff}}$ is (square size) the deeper the system is in the resilient phase.
e, The average state of the system $\langle x \rangle$ versus the average reduction in the
link weights. For Net12, the most heterogeneous of the fourteen
mutualistic networks, we observe an extreme degree of resilience, avoiding
collapse up to $f_w$ = 97% (blue triangles). A homogeneous network, with the


with large $s^{\mathrm{in}}$ tend also to have a large $s^{\mathrm{out}}$. An undirected network,
$A_{ij} = A_{ji}$, is a perfectly symmetric network, as $s^{\mathrm{in}} = s^{\mathrm{out}}$, hence $S = 1$; an
asymmetric network is one where nodes with a large in-degree tend to
have a small out-degree, in which case $S < 0$, tending towards −1
(Fig. 4c).
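
The three quantities in equation (13) are straightforward to compute from a weighted adjacency matrix. The sketch below is illustrative only: the function name is hypothetical, and it assumes the convention that A[i][j] is the weight of the link from node j to node i, so that row sums are in-degrees and column sums are out-degrees. With population variances, the sum $\langle s \rangle + S\mathcal{H}$ is algebraically equal to $\langle s^{\mathrm{in}} s^{\mathrm{out}} \rangle / \langle s \rangle$.

```python
from math import sqrt


def topology_summary(A):
    """Return (<s>, H, S, beta_eff) for a non-empty weighted adjacency matrix.

    Assumed convention: A[i][j] is the weight of the link from j to i.
    """
    n = len(A)
    s_in = [sum(A[i][j] for j in range(n)) for i in range(n)]
    s_out = [sum(A[i][j] for i in range(n)) for j in range(n)]

    def mean(values):
        return sum(values) / len(values)

    s_mean = mean(s_in)  # equals mean(s_out): both add up every link weight
    sigma_in = sqrt(mean([(s - s_mean) ** 2 for s in s_in]))
    sigma_out = sqrt(mean([(s - s_mean) ** 2 for s in s_out]))
    covariance = mean([a * b for a, b in zip(s_in, s_out)]) - s_mean ** 2

    H = sigma_in * sigma_out / s_mean
    # Without heterogeneity the correlation is undefined; S * H is zero anyway.
    S = covariance / (sigma_in * sigma_out) if sigma_in * sigma_out > 0 else 0.0
    return s_mean, H, S, s_mean + S * H


if __name__ == "__main__":
    undirected_star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
    hub_to_leaves = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
    for name, A in (("undirected star", undirected_star),
                    ("hub pointing to leaves", hub_to_leaves)):
        print(name, topology_summary(A))
```

The two toy networks illustrate the role of symmetry: the undirected star has S = 1, so heterogeneity adds to $\beta_{\mathrm{eff}}$, whereas the star whose hub only sends links has S = −1, so the same kind of heterogeneity subtracts from it.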

Equation (13) helps us to identify the network characteristics that
can enhance or weaken a system's resilience. Consider for example the
resilience of the ecosystem described in equation (5). The structure of
the resilience function (Fig. 2m) indicates that the greater $\beta_{\mathrm{eff}}$ is, the
more resilient the system is, enduring larger perturbations before reaching
the bifurcation at $\beta_{\mathrm{eff}}^{c}$ and risking a transition to the undesired $x^{\mathrm{L}}$.
Since mutualistic networks are symmetric we have $S = 1$ and
$\mathcal{H} = \sigma_{\mathrm{in}}\sigma_{\mathrm{out}}/\langle s \rangle = \sigma^{2}/\langle s \rangle$, obtaining $\beta_{\mathrm{eff}} = \langle s \rangle + \sigma^{2}/\langle s \rangle$. Hence
$\beta_{\mathrm{eff}}$ in equation (13) increases both with density $\langle s \rangle$ and with heterogeneity $\mathcal{H}$, giving
rise to a 2D phase space: a resilient phase above the phase boundary
$\langle s \rangle + \mathcal{H} = \beta_{\mathrm{eff}}^{c}$, and a non-resilient phase below it. In Fig. 4d we show
all fourteen mutualistic networks and their location on the
$\langle s \rangle$-$\mathcal{H}$ phase diagram, characterizing the source of each network's
resilience. For instance, Net11 and Net12 have comparable $\beta_{\mathrm{eff}}$ values and
hence comparable levels of resilience, indicating that both can withstand
comparable levels of perturbation before crossing the bifurcation
at $\beta_{\mathrm{eff}}^{c}$. However, Fig. 4d shows that while the source of Net11's
resilience is its high density $\langle s \rangle$, the source of Net12's resilience is its high
heterogeneity $\mathcal{H}$. To test this we constructed two homogeneous
networks with the same densities as Net11 and Net12, but with $\mathcal{H} = 0$.
As expected for Net12, reducing heterogeneity greatly decreased
resilience (by ~30%), while for Net11, whose source of resilience is density
rather than heterogeneity, eliminating $\mathcal{H}$ had only a negligible impact
(Fig. 4d, e).

In regulatory dynamics the phase diagram has two domains, cor- 
responding to an active phase and cell death. Here resilience increases 


same density and $\mathcal{H} = 0$ loses resilience at $f_w$ = 66% (red circles); hence $\mathcal{H}$
is the source of Net12's exceptional resilience. As indicated in d, Net11's
resilience is rooted in its high density $\langle s \rangle$. Indeed, we find that for Net11,
both the original (blue triangles) and the homogeneous (red circles)
networks feature comparable levels of resilience, indicating that $\mathcal{H}$ has a
marginal contribution. f, The phase diagram for directed transcription
regulatory networks is 3D ($\langle s \rangle$, $\mathcal{H}$ and $S$), with the first-order transition
from a living cell to cell death occurring at $\langle s \rangle + S\mathcal{H} = 2$. The position of
the S. cerevisiae and the E. coli networks is also shown (triangles). g, For
both organisms $S < 0$, and hence heterogeneity decreases their resilience:
indeed, the homogeneous networks (red circles) withstand larger
perturbations than the original networks (blue triangles).


with $\beta_{\mathrm{eff}}$, as the larger $\beta_{\mathrm{eff}}$ is, the deeper the system is in the active
state and the farther it is from the critical transition at $\beta_{\mathrm{eff}}^{c}$ (equation (12)).
Since $A_{ij}$ is directed, $\mathcal{H}$ can either increase or decrease $\beta_{\mathrm{eff}}$, depending
on the symmetry $S$ in equation (13). Consequently, resilience is
governed by three topological characteristics, where dense, symmetric
and heterogeneous networks are most resilient (large $\beta_{\mathrm{eff}}$), and
sparse, antisymmetric and heterogeneous networks are least resilient
(small $\beta_{\mathrm{eff}}$). The phase diagram in the $\langle s \rangle$-$\mathcal{H}$-$S$ space is shown
in Fig. 4f, and the transition between states occurs along the
plane $\langle s \rangle + S\mathcal{H} = \beta_{\mathrm{eff}}^{c}$. For the two regulatory networks we measured
$S = -0.083$ (for S. cerevisiae) and $S = -0.2464$ (for E. coli), both
negative. Hence here $\mathcal{H}$ has a negative contribution to resilience and
a homogeneous network with $\mathcal{H} = 0$ would, in fact, be more resilient.
To test this prediction we compared the resilience function of the
empirical networks (E. coli and S. cerevisiae) with that of the equivalent
homogeneous networks in which $\langle s \rangle$ is preserved and $\mathcal{H} = 0$
(Fig. 4g). Indeed, we find that eliminating heterogeneity increases the
system's resilience.

Complex systems are characterized by an inherently multi- 
dimensional parameter space, giving rise to diverse and unpredictable 
behaviour. Here, by reverting to the natural parameter space ($\beta$-space)
we exposed the hidden universal patterns of network resilience. The 
origin of this universality is in the separation of the system's dynamics 
and topology. Indeed, in most systems the intrinsic behaviour of the 
components and the nature of the interactions between them are invar- 
iable to perturbations’. Perturbations only affect the structure of the 
network, $A_{ij}$, determining who interacts with whom and how strongly.
Our formalism reduces $A_{ij}$ into an effective 1D system, showing that
regardless of the specific topology and weights, or the form of pertur- 
bation, the patterns of resilience depend only on the system's intrinsic 
dynamics. The role of the network topology is fully captured by the 
1D $\beta_{\mathrm{eff}}$, predicting that density, heterogeneity and symmetry are the
three key structural factors affecting a system's resilience. They do not
alter the critical points, but instead push a system far away from these
critical points, helping the system to sustain large perturbations. This
separation of structure and dynamics provides us with testable predictions
for the system's response to different perturbations. It also suggests
potential intervention strategies to avoid the loss of resilience²⁶⁻²⁸, or
design principles for optimal²⁹ resilient systems that can successfully
cope with perturbations³⁰.


Received 13 July; accepted 14 December 2015. 


1. Cohen, R., Erez, K., Ben-Avraham, D. & Havlin, S. Resilience of the internet to 
random breakdown. Phys. Rev. Lett. 85, 4626-4628 (2000). 

2. Venegas, J. G. et al. Self-organized patchiness in asthma as a prelude to 
catastrophic shifts. Nature 434, 777-782 (2005). 

3.  Perrings, C. Resilience in the dynamics of economy-environment systems. 
Environ. Resour. Econ. 11, 503-520 (1998). 

4. May, R. M. Thresholds and breakpoints in ecosystems with a multiplicity of 
stable states. Nature 269, 471-477 (1977). 

5. Lyapunov, A. M. The general problem of the stability of motion. Int. J. Control
55, 531-534 (1992).

6. Barzel, B. & Biham, O. Quantifying the connectivity of a network: the network 
correlation function method. Phys. Rev. E 80, 046104 (2009). 

7. Sole, R. V. & Montoya, M. Complexity and fragility in ecological networks. Proc. 
R. Soc. Lond. B 268, 2039-2045 (2001). 

8. May, R. M., Levin, S. A. & Sugihara, G. Complex systems: Ecology for bankers. 
Nature 451, 893-895 (2008). 

9. Scheffer, M. et al. Anticipating critical transitions. Science 338, 344-348
(2012). 

10. Albert, R. & Barabasi, A.-L. Statistical mechanics of complex networks. Rev. 
Mod. Phys. 74, 47-97 (2002). 

11. Ben-Naim, E., Frauenfelder, H. & Toroczkai, Z. Complex Networks Vol. 650 
(Springer Science & Business Media, 2004). 

12. Barzel, B. & Barabasi, A.-L. Universality in network dynamics. Nature Phys. 9, 
673-681 (2013). 

13. Barzel, B., Liu, Y.-Y. & Barabasi, A.-L. Constructing minimal models for complex
system dynamics. Nature Commun. 6, 7186 (2015). 

14. Alon, U. An Introduction to Systems Biology: Design Principles of Biological 
Circuits (CRC Press, 2006). 

15. Berlow, E. L. et al. Simple prediction of interaction strengths in complex food 
webs. Proc. Natl Acad. Sci. USA 106, 187-191 (2009).

16. Holland, J. N., DeAngelis, D. L. & Bronstein, J. L. Population dynamics and 
mutualism: functional responses of benefits and costs. Am. Nat. 159, 231-244 
(2002). 

17. Pastor-Satorras, R. & Vespignani, A. Epidemic spreading in scale-free networks. 
Phys. Rev. Lett. 86, 3200 (2001). 


18. Dunne, J. A., Williams, R. J. & Martinez, N. D. Network structure and biodiversity
loss in food webs: robustness increases with connectance. Ecol. Lett. 5,
558-567 (2002).

19. Wilhelm, T., Behre, J. & Schuster, S. Analysis of structural robustness of
metabolic networks. Syst. Biol. 1, 114-120 (2004).

20. McCulloch, M., Falter, J., Trotter, J. & Montagna, P. Coral resilience to
ocean acidification and global warming through pH up-regulation. Nature
Clim. Change 2, 623-627 (2012).

21. Hui, C. Carrying capacity, population equilibrium, and environment’s maximal
load. Ecol. Modell. 192, 317-320 (2006).

22. Allee, W. C. et al. Principles of Animal Ecology Edn 1 (WB Saunders, 1949).

23. Interaction Web Database http://www.nceas.ucsb.edu/interactionweb/resources.html#plant_ant
(accessed 30 September 2010).

24. Balaji, S., Madan Babu, M., Iyer, L., Luscombe, N. & Aravind, L. Principles of
combinatorial regulation in the transcriptional regulatory network of yeast.
J. Mol. Biol. 360, 213-227 (2006).

25. Gama-Castro, S. et al. Regulondb (version 6.0): gene regulation model of 
Escherichia coli k-12 beyond transcription, active (experimental) annotated 
promoters and textpresso navigation. Nucleic Acids Res. 36, D120-D124 
(2008). 

26. Nepusz, T. & Vicsek, T. Controlling edge dynamics in complex networks. Nature 
Phys. 8, 568-573 (2012). 

27. Cornelius, S. P., Kath, W. L. & Motter, A. E. Realistic control of network 
dynamics. Nature Commun. 4, 1942 (2013). 

28. Majdandzic, A. et al. Spontaneous recovery in dynamical networks. Nature
Phys. 10, 34-38 (2013). 

29. Helbing, D. & Vicsek, T. Optimal self-organization. New J. Phys. 1, 13 
(1999). 

30. Cohen, R., Erez, K., Ben-Avraham, D. & Havlin, S. Breakdown of the internet 
under intentional attack. Phys. Rev. Lett. 86, 3682-3685 (2001). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank A. Mohan, S. E. Flynn and A. R. Ganguly for 
discussions. This work was supported by an Army Research Laboratories 
Network Science Collaborative Technology Alliance grant (ARL NS-CTA 
W911NF-09-2-0053), by The John Templeton Foundation: Mathematical and 
Physical Sciences (grant number PFI-777), by The Defense Threat Reduction 
Agency (basic research grant number HDTRA1-10-1-0100) and by the European 
Commission (grant numbers FP7317532 (MULTIPLEX) and 641191 (CIMPLEX)). 


Author Contributions All authors designed and did the research. J.G. and B.B. 
did the analytical calculations. J.G. analysed the empirical data and did the 
numerical calculations. A.-L.B. and B.B. were the lead writers of the manuscript. 


Author Information All code for the reproduction of the reported results can 
be downloaded from https://github.com/jianxigao/NuRsE. Reprints and 
permissions information is available at www.nature.com/reprints. The authors 
declare no competing financial interests. Readers are welcome to comment 
on the online version of the paper. Correspondence and requests for materials 
should be addressed to A.-L.B. (alb@neu.edu). 




LETTER 


doi:10.1038/nature16536 


Non-classical correlations between single photons 
and phonons from a mechanical oscillator 


Ralf Riedinger'*, Sungkun Hong!*, Richard A. Norte?, Joshua A. Slater!, Juying Shang*, Alexander G. Krause’, Vikas Anant?, 


Markus Aspelmeyer! & Simon Groblacher? 


Interfacing a single photon with another quantum system is a
key capability in modern quantum information science. It allows
quantum states of matter, such as spin states of atoms¹,², atomic
ensembles³,⁴ or solids⁵, to be prepared and manipulated by
photon counting and, in particular, to be distributed over long
distances. Such light-matter interfaces have become crucial to
fundamental tests of quantum physics⁶ and realizations of quantum
networks⁷. Here we report non-classical correlations between
single photons and phonons, the quanta of mechanical motion,
from a nanomechanical resonator. We implement a full quantum
protocol involving initialization of the resonator in its quantum
ground state of motion and subsequent generation and read-out
of correlated photon-phonon pairs. The observed violation of a
Cauchy-Schwarz inequality is clear evidence for the non-classical
nature of the mechanical state generated. Our results demonstrate
the availability of on-chip solid-state mechanical resonators as
light-matter quantum interfaces. The performance we achieved
will enable studies of macroscopic quantum phenomena⁸ as well as
applications in quantum communication⁹, as quantum memories¹⁰
and as quantum transducers¹¹,¹².

Over the past few years, nanomechanical devices have been discussed 
as possible building blocks for quantum information architectures’. 
Their unique feature is that they combine an engineerable solid-state 
platform on the nanoscale with the possibility to coherently interact 
with a variety of physical quantum systems including electronic or 
nuclear spins, single charges, and photons'*'». This feature enables 
mechanics-based hybrid quantum systems that interconnect different, 
independent physical qubits through mechanical modes. 

A successful implementation of such quantum transducers requires 
the ability to create and control quantum states of mechanical motion. 
The first step—the initialization of micro- and nanomechanical systems 
in their quantum ground state of motion—has been realized in vari- 
ous mechanical systems either through direct cryogenic cooling'*”” or 
laser cooling using microwave’® and optical cavity fields’’. Further pro- 
gress in quantum state control has mainly been limited to the domain 
of electromechanical devices, in which mechanical motion couples 
to superconducting circuits in the form of qubits and microwave 
cavities'®. Recent achievements include single-phonon control of a 
micromechanical resonator by a superconducting flux qubit'®, the 
generation of quantum entanglement between quadratures of a micro- 
wave cavity field and micromechanical motion”, and the preparation 
of quantum squeezed micromechanical states”!. 

Interfacing mechanics with optical photons in the quantum regime 
is highly desirable because it adds important features such as the abil- 
ity to transfer mechanical excitations over long distances”. In addi- 
tion, the available toolbox of single-photon generation and detection 
allows for remote quantum state control’. However, micro- and nano- 
mechanical quantum control through single optical photons has not 


yet been demonstrated. One of the outstanding challenges is to achieve 
single-particle coupling rates that are sufficiently large to alleviate 
effects of optical and mechanical decoherence in the system, that is, 
single-photon strong co-operativity. Some of the largest optomechan- 
ical couplings have been reported in nanomechanical photonic crystal 
cavities*’, but are still two orders of magnitude short of that regime. 
Although low coupling rates can be overcome in principle by a strong 
and detuned coherent drive field'°, such measures typically result in 
unwanted heating of the mechanical device (see Methods). 

Here we take a different approach that allows us to circumvent these 
problems and to realize quantum control of single phonons through 
single optical photons. We use a probabilistic scheme based on the well- 
known DLCZ protocol (Duan, Lukin, Cirac and Zoller)**, which, in its 
original form, uses Raman scattering for efficient generation and read- 
out of collective spin states of atomic ensembles. In essence, the scheme 
generates entanglement through single-photon interference and 
post-selection, which does not require strong coupling”. In the context 
of mechanical quanta, this protocol has been used in an experiment to 
entangle high-frequency (40 THz) optical phonons of two bulk dia- 
mond lattices”°. However, the small interaction and coherence times of 
such phonons are incompatible with their use in quantum transduction 
and storage, and so it is necessary to take this approach to the level of 
chip-scale optomechanical systems. In addition, we minimize absorp- 
tion heating by using short optical pulses in a cryogenic environment”. 
The combination of these techniques allows us to overcome the previ- 
ous limitations and realize a photon-phonon quantum interface. 

Our experiment complements previous work on single- and 
two-mode (opto-)mechanical squeezing in microwave circuits???". 
Although these experiments were based on the same underlying inter- 
actions, they involved homodyne or heterodyne detection of light to 
access continuous-variable degrees of freedom of a quantum state— 
specifically, quadrature fluctuations in the mechanical and optical 
canonical variables. In contrast, the DLCZ scheme uses photon counting,
which allows access to discrete quantum variables—here, in the form of
energy eigenstates (phonons) of the mechanical motion—and thereby 
enables realistic architectures for entanglement distribution and quan- 
tum networking’. 

The mechanical system studied here is a micro-fabricated silicon 
photonic crystal nanobeam structure (Fig. 1a). Such optomechanical 
crystals co-localize optical and mechanical modes and couple them via 
a combination of radiation pressure and photostriction'. Our device 
exhibits an optical cavity resonance at wavelength $\lambda_c$ = 1,556 nm and a
mechanical breathing mode at frequency $\omega_m/(2\pi)$ = 5.3 GHz. The cavity
decay rate (full-width at half-maximum, FWHM) is $\kappa_c/(2\pi)$
= 1.3 GHz and the mechanical quality factor at cryogenic temperature
is $Q_m = 1.1 \times 10^{6}$ (see Methods). Pulsed optical driving at laser frequency
$\omega_L = \omega_c \pm \omega_m$ (in which $\omega_c = 2\pi c/\lambda_c$ is the cavity frequency and
$c$ is the vacuum speed of light) allows us to realize two different types of

Figure 1 | Generation and read-out of photon-phonon pairs.

a, Schematic of the experiment. Two independent lasers (stabilized 

to a wave-meter) are used to generate a sequence of ‘write’ and ‘read’ 
pulses with tunable time delay $\delta t$. They are sent through a circulator and
drive a nanomechanical photonic crystal cavity (a scanning electron 
microscope image of which is shown in the inset) that is mounted inside 
a dilution refrigerator at a base temperature of 25 mK, which prepares 
the device in its quantum ground state of motion. For each pulse, Stokes 
and anti-Stokes Raman scattering creates single photons (green dots) 
from the write (W) and the read (R) pulse, respectively, that are emitted 
at a frequency $\omega_c$. The detuned pump fields are strongly suppressed by
optical filtering and only the Raman scattered photons are measured by 
two superconducting nanowire single-photon detectors (SNSPDs) in the 
output ports of a 50/50 beam-splitter. The time of each photon detection 
event is recorded and is then correlated in post-processing to obtain both 
auto- and cross-correlations of the emitted photons. A more detailed 
explanation of the experimental set-up is provided in Methods. b, Pulsed 
optomechanical interactions in frequency space. A blue-detuned write 
pulse realizes a two-mode squeezing interaction (blue and green pulses; 
see text). Cavity-enhanced Stokes Raman scattering generates a single 
phonon, stored as excitation on the mechanical resonator, and a single 
(W) photon, which is emitted from the cavity on resonance (upper 
panel). Reading out of the phonon utilizes a red-detuned read pulse, 
which swaps the mechanical excitation onto the optical cavity field, 
hence creating a single (R) photon (lower panel). The insets depict the 
relevant energy level diagrams for the two processes, reminiscent of the 
Λ-schemes in atomic Raman scattering. The grey bars indicate the energy
levels that are not involved in the depicted process (Stokes or 
anti-Stokes), but in the other one. 

Figure 2 | Mechanical quantum ground state preparation. a, Principle
of sideband thermometry. The finite element method simulation depicted
in the main panel shows the structure of the mechanical breathing mode
under investigation. The upper (lower) inset shows the energy level
scheme in case of blue- (red-) detuned pumping and the resultant
cavity-enhanced Stokes (anti-Stokes) scattering. The corresponding scattering
rates $\Gamma_R$ and $\Gamma_B$ are proportional to the thermal occupation of the mechanics,
$n_{\mathrm{th}}$ and $n_{\mathrm{th}} + 1$, respectively, and hence show a strong asymmetry when the
mechanics are close to the quantum ground state. b, Sideband asymmetry.
The optomechanical device is pumped with a sequence of alternating
blue- and red-detuned optical pulses at frequency $\omega_c \pm \omega_m$ (optical
energy per pulse of 33 fJ; FWHM of 28.4 ns; 500-µs separation of
pulse sequences). Shown are the count rates recorded by the SNSPDs as a
function of the arrival time of the scattered photons (blue, blue-detuned
pulse; red, red-detuned pulse). This data has been corrected for leakage
of pump photons through the optical filters, which was independently
measured and subtracted from our data (see Methods). The inset shows a
histogram of the total counts that are obtained when averaging over a
20-ns window centred on the peak (within the dashed lines). The
pronounced asymmetry in the rates (of more than a factor of 40)
corresponds to a thermal occupancy of $n_{\mathrm{th}} = \Gamma_R/(\Gamma_B - \Gamma_R) = 0.025 \pm 0.002$
and to a mode temperature of 69 mK.


interactions on the basis of cavity-enhanced Stokes (+) and anti-Stokes
(−) Raman scattering (Fig. 1b). A blue-detuned pulse ($\omega_L = \omega_c + \omega_m$)
results in two-mode squeezing with an interaction Hamiltonian
$H_{\mathrm{tms}} \propto \hbar g_0 (\hat{a}_m^{\dagger}\hat{a}_o^{\dagger} + \hat{a}_m\hat{a}_o)$, in which $\hat{a}_m^{\dagger}$ ($\hat{a}_m$) and $\hat{a}_o^{\dagger}$ ($\hat{a}_o$) are the creation
(annihilation) operators of the mechanical and optical mode, respectively,
$g_0$ is the effective optomechanical coupling rate (here, $g_0/(2\pi)$ =
825 kHz; see Methods) and $\hbar$ is the reduced Planck constant. This
interaction generates photon-phonon pairs in close analogy to the
photon-photon pairs generated in parametric down-conversion”’. A
red-detuned pulse ($\omega_L = \omega_c - \omega_m$) allows read-out of the mechanical
state through the optomechanical beam-splitter interaction

Figure 3 | Non-classical photon-phonon correlations. a, Driving pulse
sequence. A pair of one write (blue) and one read (red) pulse is sent to the
device every 1 ms. The long idle phase between pulse pairs ensures the
ground-state initialization by cryogenic cooling. Each pulse sequence is
labelled with a number (n). The read pulse is delayed by $\delta t$ with respect to
the write pulse, and only the first 55 ns, equivalent to a read-pulse energy
of about 50 fJ, are used for the data evaluation. This reduces the influence
of absorption heating while maintaining reasonable state swap fidelity.
b, Violating a Cauchy-Schwarz inequality. Shown is the cross-correlation
(green bars) between the mechanical (read pulse) and optical state (write
pulse) for $\delta t$ = 100 ns, as well as the classical (Cauchy-Schwarz) bound
obtained from the autocorrelations at $\Delta n = 0$ (grey horizontal line,
shading indicates a 68% confidence interval; see text). For photon-phonon
pairs that emerge from different pulse sequences ($\Delta n \neq 0$) the
Cauchy-Schwarz inequality is fulfilled, $\langle g^{(2)}_{om}(\Delta n \neq 0, 100\,\mathrm{ns}) \rangle = 1.04 \pm 0.04$,
consistent with statistical independence. For pulses from the same pair,
the cross-correlation $g^{(2)}_{om}(0, 100\,\mathrm{ns})$ clearly exceeds the classical bound.
$g^{(2)}_{om}$ can be interpreted as the ratio of heralded phonons to unheralded
(thermal) phonons at the time of the read pulse. c, Storage of non-classical
correlations. Shown is the dependence of the cross-correlation
on the time delay $\delta t$ between the write and read pulses. For each data point,
the classical bound is measured independently through the normalized
autocorrelation functions of the write (W) and read (R) photons. For
increasing $\delta t$, the photon-phonon cross-correlations decrease, but stay
above the classical limit even beyond 1 µs. The main contribution to
the loss of correlation is heating by absorption of the write pulse
(see Methods). All error bars represent a 68% confidence interval.


$H_{\mathrm{bs}} \propto \hbar g_0 (\hat{a}_m^{\dagger}\hat{a}_o + \hat{a}_m\hat{a}_o^{\dagger})$, in which an anti-Stokes scattering event
realizes a state swap between the mechanical and optical cavity mode.
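
The drive frequencies for the two interactions follow directly from the device parameters given in the text. The sketch below is a simple arithmetic illustration with hypothetical function names; it works in ordinary frequencies (that is, $\omega/2\pi$) and takes the write pulse at the cavity frequency plus the mechanical frequency and the read pulse at the cavity frequency minus the mechanical frequency.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s (exact SI value)


def drive_frequencies(cavity_wavelength_m, mech_freq_hz):
    """Return cavity, write (blue-detuned) and read (red-detuned) frequencies in Hz."""
    cavity = C_VACUUM / cavity_wavelength_m
    return {
        "cavity": cavity,
        "write": cavity + mech_freq_hz,  # Stokes process, two-mode squeezing
        "read": cavity - mech_freq_hz,   # anti-Stokes process, state swap
    }


def sideband_ratio(mech_freq_hz, cavity_linewidth_hz):
    """Mechanical frequency in units of the cavity linewidth (FWHM)."""
    return mech_freq_hz / cavity_linewidth_hz


if __name__ == "__main__":
    freqs = drive_frequencies(1556e-9, 5.3e9)
    for name, value in freqs.items():
        print(f"{name}: {value / 1e12:.4f} THz")
    print(f"mechanical frequency / linewidth: {sideband_ratio(5.3e9, 1.3e9):.2f}")
```

With the quoted values the cavity sits near 192.67 THz and the mechanical frequency is roughly four cavity linewidths, which indicates how well the two sidebands are separated by the cavity response.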
Our protocol consists of three distinctive steps. First, we initialize the 
mechanical system in its quantum ground state of motion by cryogenic 
cooling. Second, a short blue pulse creates a photon-phonon pair 
and leaves the originally empty mechanical and optical modes
$|0_m\rangle$ and $|0_o\rangle$ at frequencies $\omega_m$ and $\omega_c$, respectively, in the state
$|\psi_{om}\rangle = |00\rangle + \sqrt{p}\,|11\rangle + p\,|22\rangle + O(p^{3/2})$. Here $p$ is the probability for
a single Stokes scattering event to take place. Residual heating through 
optical absorption introduces additional noise to the state (see 
Methods). Finally, a strong red pulse is used to read out the phonon 
state via emission of an anti-Stokes scattered photon”®. We confirm the 
non-classical photon-phonon correlations on the basis of an observed 
violation of a Cauchy—Schwarz inequality for the cross-correlation of 
the coincidence measurements between the Stokes and anti-Stokes 
photons’. 

Precooling of the nanomechanical device is performed using a dilu- 
tion refrigerator that operates at a base temperature of approximately 
25 mK. If the mechanical system is in its quantum ground state of 
motion, then anti-Stokes processes cannot occur because no additional 
phonons can be extracted to support the scattering. This is in contrast 
to Stokes processes, which deposit mechanical energy and hence can 
always occur. As a consequence, the asymmetry in the scattering rates 
of these two processes is a direct measurement of the mean thermal 
phonon occupancy $n_{\mathrm{th}}$. Using such photon-counting based sideband
thermometry”’, we find $n_{\mathrm{th}}$ < 0.025 (see Fig. 2).
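
The arithmetic behind sideband thermometry can be sketched as follows. Because the Stokes rate scales with the occupancy plus one and the anti-Stokes rate with the occupancy itself, the occupancy is the red rate divided by the difference of the two rates; a Bose-Einstein distribution then converts it into a temperature. The function names are hypothetical, and the 41:1 rate ratio used in the example is an illustrative input chosen to give an occupancy of 0.025, not a measured value.

```python
from math import log

H_PLANCK = 6.62607015e-34   # Planck constant, J s (exact SI value)
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K (exact SI value)


def thermal_occupancy(rate_blue, rate_red):
    """Occupancy from Stokes (blue) and anti-Stokes (red) scattering rates."""
    if rate_blue <= rate_red:
        raise ValueError("the Stokes rate must exceed the anti-Stokes rate")
    return rate_red / (rate_blue - rate_red)


def mode_temperature(n_th, freq_hz):
    """Temperature at which a mode of frequency freq_hz has mean occupancy n_th."""
    return H_PLANCK * freq_hz / (K_BOLTZMANN * log(1.0 + 1.0 / n_th))


if __name__ == "__main__":
    n_th = thermal_occupancy(rate_blue=41.0, rate_red=1.0)  # illustrative ratio
    print(f"n_th = {n_th:.3f}")
    print(f"T = {mode_temperature(n_th, 5.3e9) * 1e3:.1f} mK")
```

For an occupancy of 0.025 at 5.3 GHz this gives a temperature just under 70 mK, in line with the mode temperature quoted in Fig. 2 once the rounding of the inputs is taken into account.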

We create the desired photon-phonon pairs using a blue-detuned 
‘write’ pulse that is sufficiently weak to minimize the effects of residual 
absorption heating (FWHM, 28.4 ns; energy, 40 fJ). We find the relevant 
probability to generate a Stokes scattered photon on cavity resonance 
to be $p \approx 3.0\%$. Subsequently, a red-detuned ‘read’ pulse (effective
length, 55 ns; energy of approximately 50 fJ) is injected at a time delay $\delta t$
(see Fig. 3a), resulting in a phonon-to-photon conversion efficiency of 
approximately 3.7% (see Methods). 

We correlate the measured Stokes and anti-Stokes photons via
the cross-correlation function $g^{(2)}_{om}(\Delta n, \delta t) = P(W \cap R)/[P(R)P(W)]$,
which is computed for read and write pulses originating from pulse
sequences from different trials separated by $\Delta n$ iterations (see Fig. 3).
$P(W \cap R)$ is the probability for a joint detection of both a Stokes
(W, ‘write’) and an anti-Stokes (R, ‘read’) photon from these pulses, and 
P(W) and P(R) are the unconditional probabilities to detect either of 
the two photons. For all pair correlations of classical origin, the value 


of g” is bounded by a Cauchy-Schwarz inequality of the form? 
1/2 

£)(0,88) < |g), (0g, w(0)].»inwhich g (O)and g) (0) are 
the autocorrelation functions for the optical and mechanical mode, 
respectively (see Methods). A violation of this inequality*?°”° is an 
unambiguous measure for the non-classicality of the generated pho- 
ton-phonon state. The Cauchy—Schwarz inequality for coincidence 
detection marks a well-defined border between the quantum and clas- 
sical domain. It is based on the fact that the Glauber-Sudarshan phase- 
space function, or P function, is positive definite for every classical 
field. This places a fundamental limit on the relative strength of meas- 
urable cross-correlations versus autocorrelations between classical 
fields. Previous applications of this limit include the distinction 
between the classical and quantum field theoretical predictions for the 
photoelectric effect?®, and the storage and retrieval of non-classical 
states in the collective emission from an atomic ensemble®. A detailed 
derivation of the Cauchy—Schwarz inequality for the case of non- 
stationary fields, as are being used here, is provided in ref. 3. 

We find a clear violation for an extended regime of time delays. 
Figure 3b shows the value of $g^{(2)}_{om}$ at a time delay of 100 ns. For pairs
emitted from the same pulse sequence ($\Delta n = 0$) we find that

$$g^{(2)}_{om}(0, 100\,\mathrm{ns}) = 8.0^{+0.6}_{-0.5}$$

and

$$\left[ g^{(2)}_{oo}(0)\, g^{(2)}_{mm}(0) \right]^{1/2} = 2.09^{+0.23}_{-0.16},$$

so

$$g^{(2)}_{om}(0, 100\,\mathrm{ns}) \nleq \left[ g^{(2)}_{oo}(0)\, g^{(2)}_{mm}(0) \right]^{1/2},$$

which obviously violates the classical bound. As expected, pairs emitted
from different pulse sequences ($\Delta n \neq 0$) are uncorrelated and hence
fulfil the inequality. Upon increasing the time delay further, we find
a violation even beyond $\delta t = 1$ μs (see Fig. 3c), which demonstrates
that we can store and retrieve non-classical states for an extended time
interval. Nevertheless, the lifetime of these non-classical correlations
is still much shorter than the lifetime of the mechanical excitations,
$Q/\omega_m \approx 34$ μs. We attribute this to the fact that the dynamics are
dominated by heating caused by absorption of pump photons, which after
some onset time drives the mechanical system towards a thermal state 
(see Methods). As a consequence, reducing the energy of the write 
pulse further should allow non-classical correlations to be maintained 
for much longer times. In addition, upon further reduction of the 
absorption heating of the read pulse, even higher values for the cross- 
correlation are obtained. 
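As a rough cross-check of the quoted cross-correlation, the estimator $P(W \cap R)/[P(R)P(W)]$ can be evaluated directly from raw counts. The sketch below uses the counts listed for the 0.1 μs delay in Extended Data Table 1 and assumes that the single-detector counts can simply be summed over both detectors (double counts are negligible at these rates). The function names are illustrative, and the classical bound of 2.09 is taken from the text rather than recomputed.

```python
import math


def cross_correlation(n_coinc, n_read, n_write, n_trials):
    """Estimate g2 = P(W and R) / [P(R) P(W)] with probabilities = counts / trials."""
    p_joint = n_coinc / n_trials
    p_read = n_read / n_trials
    p_write = n_write / n_trials
    return p_joint / (p_read * p_write)


def classical_bound(g_auto_optical, g_auto_mechanical):
    """Cauchy-Schwarz bound sqrt(g_oo * g_mm) on the cross-correlation."""
    return math.sqrt(g_auto_optical * g_auto_mechanical)


# Counts for the 0.1 us delay (Extended Data Table 1), both detectors summed.
trials = 38_806_017
g_om = cross_correlation(
    n_coinc=202,
    n_read=13_172 + 17_490,
    n_write=12_471 + 19_409,
    n_trials=trials,
)
print(f"g_om = {g_om:.2f}")

# Two ideal thermal fields (g = 2 each) would give a bound of exactly 2;
# the bound reported in the text is 2.09.
print("thermal bound:", classical_bound(2.0, 2.0))
print("exceeds reported bound:", g_om > 2.09)
```

The result agrees with the reported value of 8.0 to within rounding; the asymmetric uncertainties require the likelihood analysis described in Methods.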

The cross-correlation is also linked to the autocorrelation of the
heralded mechanical state. If one considers two-mode optomechanical
squeezing acting on an initial mechanical thermal state, and if
$g^{(2)}_{om} \gg 1$ (as is the case in our experiment), then one obtains
$g^{(2)}_{mm,\mathrm{heralded}} \approx 4/(g^{(2)}_{om} - 1)$. The largest value for $g^{(2)}_{om}$ observed in our
experiment was $g^{(2)}_{om}(0, 100\,\mathrm{ns}) = 19.6^{+3.9}_{-2.8}$ (using an energy of
1.7 fJ in the first 30 ns of the read pulse; see Methods). In other words,
our system should allow for a Hanbury Brown and Twiss experiment
with phonons yielding $g^{(2)}_{mm,\mathrm{heralded}} \approx 0.22$. A direct measurement of
this value with the current experimental parameters is difficult without
a prohibitively large number of pulse sequences.

In summary, we have demonstrated non-classical correlations
between single photons and phonons from a nanomechanical resonator.
This is a crucial step towards on-chip photon-phonon quantum
interfaces, which are relevant for future solid-state based quantum
information and communication architectures. For example, the
observed photon-phonon correlation of $g^{(2)}_{om} = 19.6$ suggests that
conditional mechanical Fock-state preparation should be possible with
fidelities exceeding 85% (see Methods). The ability to store and retrieve
non-classical states over extended storage times that we reported also
shows that nano-optomechanical resonators are a promising candidate
for quantum memories. The performance of the system we have
demonstrated constitutes an improvement of almost two orders of magnitude
on previous lifetimes of stored non-classical single-phonon
states'®. Finally, photon-phonon conversion on the single particle level
is required to extend the ongoing efforts on mechanically transduced
conversion between microwave and optical fields!” into the quantum
domain".
METHODS


Device fabrication and characterization. The optomechanical device used 
for this experiment (see Extended Data Fig. 1) is fabricated from a silicon-on-insulator
wafer, with a device layer thickness of 250 nm and 3 μm of buried oxide.
The structures are patterned using an electron beam writer and are then transferred
into the top silicon layer in an SF$_6$/O$_2$ atmosphere using a reactive ion-etcher. The
devices are finally released and undercut using concentrated hydrofluoric acid. We 
design the nanobeams such that the fundamental mechanical breathing mode is at 
5.3 GHz (see Fig. 2a) and the optical resonance is around 1,550 nm (the measured 
wavelength for the device used here is 1,556 nm)*!. The optical and the mechanical 
modes are co-localized in the centre of the beam, where we create a defect region of 
the photonic and phononic bandgap, allowing for an optomechanical coupling rate 
$g_0/(2\pi) = 825$ kHz. To minimize the thermalization time to the surrounding bath
we opted, unlike previous designs, to not use any additional phononic shielding. 
As a consequence, the mechanical quality factors at base temperature are found to
be about $1.1 \times 10^6$ (see Methods section ‘Mechanical response to optical pulses’),
compared to values greater than $10^7$ with a phononic shield’”. The laser pulses are
coupled directly into a tapered waveguide through an optical fibre with a lensed 
tip®’, achieving efficiencies of about 60%. The optical mode of the nanobeam is 
evanescently coupled to the waveguide, which is terminated with a periodic array 
of holes, which acts as a mirror, allowing us to collect the light in reflection. For this 
experiment we chose a critically coupled device (internal losses equal to external 
losses) with an optical linewidth $\kappa_c/(2\pi)$ of approximately 1.3 GHz. This places us
well within the so-called resolved-sideband regime ($\omega_m > \kappa_c$).

Set-up. In this section, we provide a detailed description of the experimental 
set-up. It consists of a ‘pump part’, a ‘detection part’ and the ‘electronic control
part’ (see Extended Data Fig. 2). 

Pump part. We use two identical, tunable continuous-wave lasers (New Focus 
6728) as our light sources. The lasers are detuned and stabilized to the blue and red 
side, respectively, of the cavity resonance of the device (1,556.21 nm). The detun- 
ing is set to be the mechanical frequency (5.307 GHz). The two lasers separately 
pass through voltage-controlled tunable optical filters (MicronOptics FFP-TF2, 
free spectral range of about 18 GHz, bandwidth of about 50 MHz) to suppress 
any potential background emissions dispersed in frequency space. To create short 
optical pulses, we modulate the filtered continuous-wave fields using acousto- 
optic modulators (AOMs; IntraAction) and an additional electro-optic amplitude 
modulator (EOM; EOSpace). We use variable optical attenuators (VOAs; Sercalo) 
on each path to control the pulse power. The pulses are combined on a variable 
optical coupler and then sent to the device in the dilution refrigerator (Vericold 
E21) via an optical circulator. At the device (OMC, optomechanical crystal), the 
optomechanical interaction with the blue (red) detuned pulses generates down- 
(up-) converted photons, whose frequency is on resonance with the optical cavity 
frequency of the device. The scattered photons are reflected back from the OMC 
into the optical fibre and routed to the detection part through the output port of 
the circulator. 

Detection part. Two voltage-controlled optical filters (MicronOptics FFP-TF2, 
specification as above) are installed in series at the beginning of the detection 
path. These filters are tuned on resonance with the OMC cavity frequency such 
that they only allow (anti-)Stokes scattered photons to be transmitted, while strong
off-resonant pump photons are rejected (suppression of about 84 dB). After the
filters, a 50:50 beam splitter divides the path. Each output is additionally filtered
by broadband wavelength-division multiplexers (WDM), and fibre-coupled to
two superconducting nanowire single-photon detectors (SNSPD; PhotonSpot, 
detection efficiency of about 90%, dark count rate of <10 Hz). The SNSPDs are 
mounted on the 1-K plate inside the dilution refrigerator. Upon receiving a photon, 
the SNSPD generates a brief voltage spike, which is then electrically registered by 
a time-correlated single-photon counting module (TCSPC; PicoQuant TimeHarp 
260 NANO). The overall efficiency of detecting a photon leaving the OMC is 
approximately 2.7% (see below). 

Control part. To generate programmable optical pulses and to detect photons syn- 
chronously, we use a digital pulse generator (DPG; Highland Technology P400) 
and an arbitrary waveform generator (AWG; Agilent Technologies 81180A). We 
first program the DPG to generate a TTL (transistor-transistor logic) gate voltage 
signal for the AOM on the read (red) path and to trigger the TCSPC synchro- 
nously. The DPG additionally triggers the AWG, which then generates a TTL 
gate voltage for the AOM and a voltage pulse for the EOM on the write (blue) 
path. 

Mechanical response to optical pulses. Over the past few years, several 
experiments have demonstrated precise control over optical and mechanical 
states through continuous optomechanical driving, including coherent state 
transfer”°?34 and microwave-to-optics conversion!**>°. Owing to the unavailability
of the regime of single-photon strong co-operativity, strong drive fields
have to be used to achieve the desired coupling strength*”. This leads to unwanted 
heating effects, in particular in the optical domain. Because the mechanism of
optical absorption couples only indirectly to the mechanical mode of interest*”, 
using short optical pulses as non-stationary drive fields can substantially suppress 
the heating on short timescales—in particular at low temperatures!”. 

Here, we probe the thermal response of the mechanical mode by pump- 
probe-type measurements; we first send a short blue-detuned pump pulse 
onto the OMC cavity to intentionally heat the mode, and subsequently inject 
a red-detuned probe pulse to read out the phonon occupancy of the mode. By 
repeating the experiment with varying time delay between the pump and the 
probe pulses, we monitor time-dependent evolution of the phonon occupancy 
of the mode with a fixed initial impulse heating. The time delay $\delta t$ is defined as
the delay between the end of the pump (blue) pulse and the start of the probe 
(red) pulse detection window, as indicated in Fig. 3a. In that way, the probing 
is performed after the optical absorption of the pump photons is completed. 
To ensure that the mechanical mode fully re-thermalizes to the bath, we set 
the duty cycle of sending another blue pulse after the red pulse to be one milli- 
second. To improve the signal-to-noise ratio, the pulse energies of the blue 
(200 fJ) and red pulses (2 pJ) used here are substantially larger than in the 
cross-correlation measurements. 

The effective mode temperature is inferred from the average count rate
observed after sending the red pulse ($C_R$). $C_R$ can be decomposed into three terms:
(1) the rate proportional to the (on-resonance) anti-Stokes Raman scattered pump
photons ($C_{AS}$); (2) a term corresponding to pump photons leaked through the
optical filters ($C_{leak}$); and (3) an additional anti-Stokes scattering term due to
heating (ref. 17) of the mode during the read-out pulse ($C_{heat}$). We minimize
$C_{heat}$ by taking into account only the first 30 ns of the red pulse as the ‘logical’ red
pulse. $C_{leak}$ gives a constant offset to the signal. $C_{AS}$ directly reflects the effective
temperature of the mode, because the anti-Stokes scattering rate is proportional
to the average number of phonons ($n_m$) in the mode (see Fig. 2a). To that end, we
deduce that $C_R(\delta t) = C_{AS}(\delta t) + C_{leak} = a\,n_m(\delta t) + C_{leak}$, in which $a$ is the constant
of proportionality.

The long-term response of the mechanical mode to the initial blue pump pulse 
is shown in Extended Data Fig. 3a. It exhibits an exponential decay with a time 
constant of $T_1 = 34.4$ μs, which is interpreted as the mechanical damping time. The
corresponding mechanical quality factor is then $Q = \omega_m T_1 \approx 1.1 \times 10^6$.
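The quoted quality factor follows from the fitted decay time and the mechanical frequency given in the set-up description (5.307 GHz). A minimal sketch of that arithmetic, with an illustrative function name:

```python
import math


def quality_factor(frequency_hz, decay_time_s):
    """Q = omega_m * T with omega_m = 2 * pi * frequency."""
    return 2.0 * math.pi * frequency_hz * decay_time_s


# Mechanical frequency 5.307 GHz and fitted energy decay time 34.4 us.
q = quality_factor(5.307e9, 34.4e-6)
print(f"Q = {q:.3g}")
```

The result is close to $1.15 \times 10^6$, consistent with the rounded value of about $1.1 \times 10^6$ quoted in the text.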

In addition, we probe the short-term response of the mechanics within one 
microsecond after the blue pulse in more detail (Extended Data Fig. 3b). We 
observe an increase in $C_R$ with a time constant of 0.37 μs (fit to a simple exponential
curve). These data reveal slow turn-on dynamics of pulse-induced heating,
as previously studied’”. This time constant is even shorter than the decay of
the cross-correlations (see Fig. 3c), which we attribute to the increased thermal 
conductivity of silicon at higher temperatures, caused by absorption of increased 
optical pump energies. 

Characterization of the detection scheme. Detection efficiency. We first calibrate
the fibre-to-chip coupling efficiency ($\eta_{fc}$) by sending in light far off-resonant from
the OMC cavity, and then measure the reflected power ($\eta_{fc} = 60.3\%$ one-way).
The device impedance ratio ($\eta_c$), that is, the ratio of external coupling losses $\kappa_{ext}$
to total losses $\kappa_c$, is measured through the depth and the linewidth of the optical
resonance, and is found to be $\eta_c = \kappa_{ext}/\kappa_c = 0.5$. The detection efficiency of scattered
photons for each detector ($\eta_i$, $i = 1, 2$) consists of $\eta_{fc}$, $\eta_c$, the total losses of the
remaining detection paths ($\eta_{path,i}$) and the quantum efficiencies of the SNSPDs
($\eta_{QE,i}$). To measure $\eta_i$, pulses with calibrated energy are sent off-resonantly to the
OMC ($P_{in}$), and the reflected photons transmitted through the optical filters are
detected by the SNSPDs ($P_{out}$). The ratio $P_{out}/P_{in}$ corresponds to
$\eta_{fc} \times \eta_{fc} \times \eta_{path,i} \times \eta_{QE,i}$ (the light passes the fibre-to-chip interface twice),
which we measure to be 0.013 for SNSPD1 and 0.019 for SNSPD2. Therefore, we
deduce $\eta_i = \eta_c \eta_{fc} \eta_{path,i} \eta_{QE,i}$ to be $\eta_1 = 1.1\%$ and $\eta_2 = 1.6\%$. The detection efficiency
of SNSPD1 (characterized quantum efficiency $\eta_{QE,1} = 65\%$) is lower than SNSPD2
(characterized quantum efficiency $\eta_{QE,2} = 90\%$), because we needed to reduce the
bias current to prevent the detector from latching*®. This latching is probably
caused by a nearby heater of the dilution refrigerator. It also results in a slow drift
in the quantum efficiency of SNSPD1. The deduced $\eta_{path,i}$ come from the various
optical elements in the beam path of the detection part and are in good agreement
with their specified insertion losses.
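The step from the measured reflection ratios to the per-detector efficiencies can be reproduced with a few lines. The sketch assumes that the off-resonant reflection ratio contains the fibre-to-chip efficiency twice (in and out), whereas a photon generated in the cavity passes that interface once and leaves through the waveguide with probability $\eta_c$. The function name is illustrative.

```python
def detection_efficiency(reflection_ratio, eta_fc, eta_c):
    """Probability that a photon leaving the cavity is detected.

    reflection_ratio is the measured off-resonant P_out / P_in, assumed to equal
    eta_fc**2 * eta_path * eta_QE (fibre-to-chip interface passed twice).
    A cavity photon passes the interface once and couples out with eta_c.
    """
    eta_path_times_qe = reflection_ratio / eta_fc**2
    return eta_c * eta_fc * eta_path_times_qe


eta_fc, eta_c = 0.603, 0.5
eta_1 = detection_efficiency(0.013, eta_fc, eta_c)
eta_2 = detection_efficiency(0.019, eta_fc, eta_c)
print(f"eta_1 = {eta_1:.1%}, eta_2 = {eta_2:.1%}, total = {eta_1 + eta_2:.1%}")
```

Under these assumptions the two values round to 1.1% and 1.6%, and their sum matches the overall detection efficiency of approximately 2.7% quoted in the set-up description.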

Scattering rates and optomechanical coupling rate. From the total detection
efficiency of resonantly generated cavity photons, we estimate the pair generation
probability per write pulse (optical energy of 40 fJ) to be $p \approx 3.0\%$,
which includes the effects of a finite starting temperature and leaked pump
photons. The latter is calibrated by sending detuned optical pulses (energy of 40 fJ,
frequency $\omega_c - \omega_m - 2\pi \times 200$ MHz) to the device. The generated optomechanical
sidebands are now blocked by the filters and only leaked pump photons are detected.
We measure a suppression of the pump pulse by 84 dB compared to an on-resonance
transmission. Thus, approximately 1 out of 25 photons detected during
the write pulse is a leaked pump photon. Given the scattering rate and the energy
of the detuned pump pulse, we determine the single-photon coupling rate of our
OMC to be $g_0/(2\pi) \approx 825$ kHz.

With this coupling rate, we estimate the state-transfer efficiency of the red-detuned
optical read-out pulse with an energy of 50 fJ to be $\varepsilon_R = 3.7\%$, where
$\hat a_{opt,out} \approx \sqrt{1 - \varepsilon_R}\,\hat a_{opt,in} + e^{i\phi} \sqrt{\varepsilon_R}\,\hat a_{mech,in}$. Here, $\hat a_{opt,in(out)}$ are the annihilation
operators of the temporal optical input (output) mode of the cavity resonance,
$\hat a_{mech,in}$ is the mechanical mode before the interaction and $\phi$ is an arbitrary, but
fixed, phase between the inputs”.

Definition and properties of the second-order correlation function. We define
the normalized second-order correlation function for two, not necessarily different,
modes $\alpha$ and $\beta$, and the respective annihilation operators $\hat a_\alpha$ and $\hat a_\beta$, to be (refs 3,
40 and references therein)

$$g^{(2)}_{\alpha\beta} = \frac{\langle : \hat a_\alpha^\dagger \hat a_\alpha \hat a_\beta^\dagger \hat a_\beta : \rangle}{\langle \hat a_\alpha^\dagger \hat a_\alpha \rangle \langle \hat a_\beta^\dagger \hat a_\beta \rangle},$$

where $: \hat O :$ denotes normal ordering of the operators. For the autocorrelation of
the optical field (photons scattered by the write pulse), $\alpha = \beta = o$, for the mechanical
field $\alpha = \beta = m$ and for the cross-correlation $\alpha = o$, $\beta = m$. By introducing
effective modes $\gamma$ and $\delta$ it can be seen that this correlation function is independent
of losses in the detection. Assuming the loss angles $\varphi_\alpha$ and $\varphi_\beta$ for detection of
modes $\alpha$ and $\beta$, respectively, we define the annihilation operators of the effectively
detected modes $\gamma$ and $\delta$, $\hat a_{\gamma/\delta} = \cos(\varphi_{\alpha/\beta})\,\hat a_{\alpha/\beta} + \sin(\varphi_{\alpha/\beta})\,\hat l_{\alpha/\beta}$, by coupling
the original modes $\alpha$ and $\beta$ to modes $l_\alpha$ and $l_\beta$, represented by the annihilation
operators $\hat l_{\alpha/\beta}$. As the detected modes $\gamma$, $\delta$ have frequencies in the optical domain,
we assume the in-coupled modes $l_\alpha$, $l_\beta$ to be in their respective ground state.
Tracing over $l_\alpha$, $l_\beta$ we find that $g^{(2)}_{\gamma\delta} = g^{(2)}_{\alpha\beta}$, that is, that the second-order correlation
function is independent of losses or, in the case of the mechanical mode, of ‘ineffective’
partial state-transfer to the cavity mode. Thus, for example, $g^{(2)}_{mm}$ is equivalent
to the autocorrelation of the photons scattered by the read pulse.

For autocorrelation measurements, we use a Hanbury Brown and Twiss set-up
by splitting the mode on a symmetric beam-splitter and sending it to a pair of
detectors. We define the modes detected by the individual detectors $d_1$, $d_2$ with
their annihilation operators $\hat a_{d_1, d_2} = \cos(\theta)\,\hat a_\alpha + \sin(\theta)\,\hat l_\theta$, in which $\theta$ is the splitting
angle of the beam-splitter and $\hat l_\theta$ is the annihilation operator of the second input
of the beam-splitter. The input state can, as before, be approximated to be in its
vacuum state. We find that the autocorrelation of mode $\alpha$ is equal to the
cross-correlation of the two detectors: $g^{(2)}_{\alpha\alpha} = g^{(2)}_{d_1 d_2}$. For a definition in terms of
probabilities, see below.

Statistical analysis. No statistical methods were used to predetermine sample size. 

Owing to the low detection probability, the uncertainty in the estimation of the
second-order correlation functions is completely dominated by the estimation of
the coincidence rate $\langle : \hat a_\alpha^\dagger \hat a_\alpha \hat a_\beta^\dagger \hat a_\beta : \rangle$ of the two modes $\alpha$, $\beta$. As the absolute number
of coincidences is low in some measurements, Gaussian statistics cannot be
used for estimating uncertainties. Instead, we use the likelihood function based on
the binomial distribution for estimating the probability $p$ of the underlying process;
that is, to obtain $N$ counts in $T$ tries, $L(p, N, T) = p^N (1 - p)^{T - N}/K$.

The normalization $K$ is chosen such that $\int_0^1 L(p, N, T)\,dp = 1$. The upper and
lower uncertainties $\sigma_+$ and $\sigma_-$ are chosen numerically, such that they cover a 68%
confidence interval around the maximum likelihood estimator $p_{ML} = N/T$:

$$\int_0^{p_{ML} - \sigma_-} L(p, N, T)\,dp = 0.16, \qquad \int_{p_{ML} + \sigma_+}^{1} L(p, N, T)\,dp = 0.16.$$

For the classical bound on the cross-correlation, $g_{cb} = [g^{(2)}_{oo}\, g^{(2)}_{mm}]^{1/2}$, the likelihood
functions of the individual autocorrelations are convoluted. Owing to their asymmetry,
the maximum likelihood estimator of the classical bound is slightly
lower than when using the individual maximum likelihood estimators:

$$g_{cb,ML} \le \left[ g^{(2)}_{oo,ML}\, g^{(2)}_{mm,ML} \right]^{1/2}.$$
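The asymmetric uncertainties described above can be obtained numerically with the standard library alone. The following is a minimal sketch, not the authors' code: it evaluates the binomial likelihood on a grid around $N/T$, normalizes it by trapezoidal integration and locates the points that leave 16% of the probability in each tail. The function name, grid size and grid width are assumptions chosen for illustration.

```python
import math


def binomial_interval(n_counts, n_trials, tail=0.16, grid=20001):
    """Return (p_ml, sigma_minus, sigma_plus) from the normalized binomial likelihood.

    The likelihood L(p) ~ p**N * (1 - p)**(T - N) is evaluated in log space on a
    grid around N / T, integrated with the trapezoidal rule, and the interval is
    chosen so that a fraction `tail` of the probability lies on each side.
    """
    p_ml = n_counts / n_trials
    width = 10.0 * math.sqrt(n_counts + 1.0) / n_trials  # generous grid half-width
    lo = max(p_ml - width, 0.0)
    hi = min(p_ml + width, 1.0)
    step = (hi - lo) / (grid - 1)
    ps = [lo + i * step for i in range(grid)]

    def log_like(p):
        if p <= 0.0:
            return 0.0 if n_counts == 0 else -math.inf
        if p >= 1.0:
            return 0.0 if n_counts == n_trials else -math.inf
        return n_counts * math.log(p) + (n_trials - n_counts) * math.log1p(-p)

    logs = [log_like(p) for p in ps]
    peak = max(logs)
    dens = [math.exp(v - peak) for v in logs]  # rescaled to avoid underflow

    cum = [0.0]  # cumulative trapezoidal integral of the likelihood
    for i in range(1, grid):
        cum.append(cum[-1] + 0.5 * (dens[i] + dens[i - 1]) * step)
    total = cum[-1]

    def quantile(q):
        target = q * total
        for i in range(1, grid):
            if cum[i] >= target:
                frac = (target - cum[i - 1]) / (cum[i] - cum[i - 1])
                return ps[i - 1] + frac * step
        return ps[-1]

    return p_ml, p_ml - quantile(tail), quantile(1.0 - tail) - p_ml


# Example: 202 coincidences in 38,806,017 pulse pairs (Extended Data Table 1).
p_ml, s_minus, s_plus = binomial_interval(202, 38_806_017)
print(f"relative uncertainty: -{s_minus / p_ml:.1%} / +{s_plus / p_ml:.1%}")
```

For these counts the relative uncertainties are a few per cent and the upper one is larger than the lower one, which is the asymmetry the text refers to. Propagating the classical bound additionally requires convolving the two autocorrelation likelihoods, which is not attempted here.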


As estimators for the cross-correlation function, the probabilities $P$ of a coincident
or single detection event during the read (R) and write pulse (W) were
used, with $g^{(2)}_{om} = P(W \cap R)/[P(R)P(W)]$. This is valid for low event probabilities
$P \ll 1$. Autocorrelations were estimated using probabilities of coincident and single
detection events on individual SNSPDs (1, 2), $g^{(2)}_{\nu\nu} = P(X_1 \cap X_2)/[P(X_1)P(X_2)]$,
during the evaluation periods of the write ($\nu = o$, $X = W$) and read ($\nu = m$, $X = R$)
pulse, respectively. The statistics of the cross-correlation measurements are summarized
in Extended Data Table 1.


For the read pulse, only the first 55 ns of the pulse were evaluated (see Fig. 3). A
further reduction of the evaluation period to $t_{eval} = 30$ ns (R*) has the advantage
of reducing the influence of optical absorption of pump photons from the read
pulse, while still obtaining solid statistics for the cross-correlation,
$g^{(2)}_{om}(\Delta n = 0, \delta t = 100\,\mathrm{ns}, t_{eval} = 30\,\mathrm{ns}) = 19.6^{+3.9}_{-2.8}$. However, this reduction
also results in a much lower state-transfer efficiency $\varepsilon_R = 0.1\%$ (compared to
$\varepsilon_R = 3.7\%$ above; see Methods section ‘Characterization of the detection scheme’).
As a consequence we cannot obtain independent statistics on the autocorrelation
function of the read pulse, because the number of pulse sequences is too low to
observe coincidences during the reduced read pulse, $N(R^*_1 \cap R^*_2) = 0$. Thus, no
independent classical bound $g^*_{cb}$ can be obtained for this case. Because the measurement
is identical to the one with a longer evaluation window in the first column
of Extended Data Table 1, it is reasonable to assume the same autocorrelation of
the mechanical state and, therefore, the same classical limit.

We note that slight differences in the polarization of the two input lasers and the 
optimal axis of the SNSPDs can lead to different detection rates of leaked pump 
photons between the read and the write pulse. Although this does not influence 
the cross-correlation measurement, it is important to use the same laser source for 
the sideband asymmetry measurements. 

Interpretation of the cross-correlation measurements. Classical bound. The classical
bound $g_{cb}$ is found to be slightly greater than two, the value expected for a
thermal state of the mechanical system (see Fig. 3c). Although this increase of the
autocorrelation is not significant ($< 3\sigma$) in our measurements, a behaviour like this
would be expected in the case of mixed thermal states of different temperatures, 
caused, for example, by fluctuations in the absorbed power. Effects that usually 
decrease the measured autocorrelation function of a thermal state, such as dark 
counts of the detectors and instantaneous heating by the read pulse, do not play a 
major role in our experiment, owing to the choice of pulse parameters. In conclu- 
sion, the classical bound in the present experiment is slightly elevated compared 
to cross-correlation experiments in atomic physics or nonlinear optics, where the 
classical bound is usually assumed to be*! less than two. 

Decay of cross-correlations due to delayed heating. The cross-correlation can be
interpreted as

$$g^{(2)}_{om} = \frac{\langle n_m \rangle_h}{\langle n_m \rangle},$$

in which $\langle n_m \rangle_h$ is the average number of mechanical excitations in the state heralded
on a detection event of the write pulse (indicating the presence of an anti-Stokes
scattered photon), and $\langle n_m \rangle$ is the average number of unheralded events
(essentially probing the thermal excitation of the system when $p \ll 1$). In case of
a delayed heating, the thermal occupation of the system is a function of the delay
$\delta t$ after the write pulse, $\langle n_{m,th} \rangle = \langle n_{m,th} \rangle(\delta t)$. Assuming our cross-correlation is
dominated by the thermal occupation, we obtain, for $\delta t \ll T_1$,

$$g^{(2)}_{om}(\delta t) \approx \frac{1 + \langle n_{m,th} \rangle(\delta t)}{\langle n_{m,th} \rangle(\delta t)},$$

which clearly decays in the case of substantial delayed heating, as is observed here
(see Extended Data Fig. 3). Theoretical models of the complex thermodynamic
non-equilibrium processes contain many device-dependent parameters!”, which
will be the subject of further studies.
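To make the thermally limited expression concrete, the sketch below evaluates $(1 + n_{th})/n_{th}$ for a few occupancies and inverts it. The occupancy values are illustrative, not measured, and the function names are hypothetical.

```python
def thermal_limited_g2(n_thermal):
    """Cross-correlation (1 + n_th) / n_th when thermal occupation dominates."""
    if n_thermal <= 0:
        raise ValueError("thermal occupation must be positive")
    return (1.0 + n_thermal) / n_thermal


def occupancy_from_g2(g2):
    """Inverse relation n_th = 1 / (g2 - 1), defined for g2 > 1."""
    if g2 <= 1:
        raise ValueError("g2 must exceed 1")
    return 1.0 / (g2 - 1.0)


# Illustrative occupancies: the correlation drops towards 1 as the mode heats up.
for n_th in (0.05, 0.5, 2.0):
    print(f"n_th = {n_th:4.2f} -> g2 = {thermal_limited_g2(n_th):5.2f}")

# Within this simple model, g2 = 8 corresponds to n_th = 1/7.
print(f"n_th for g2 = 8: {occupancy_from_g2(8.0):.3f}")
```

Because the expression depends only on the instantaneous thermal occupation, any delayed rise of $n_{th}$ after the write pulse translates directly into a decay of the cross-correlation with $\delta t$.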

Estimation of the heralded single-phonon fidelity. In general, the toolbox of quantum 
optics provides a unique means for quantum state control of various systems”. As 
an example, we discuss the application of single-photon detection for the heralded 
generation of single-phonon Fock states of our mechanical resonator??*"*. To 
estimate the fidelity of the single-phonon state directly after heralding on the detec- 
tion of a resonant photon generated by the write pulse, we need to know all the 
contributions to the diagonal of the density matrix, which are not a single phonon. 
These contributions can either be higher excitations by thermal contribution, mul- 
ti-pair generation, or vacuum states by false-positive heralding events. Higher 
excitations can be estimated using the autocorrelation function of the heralded 
state, which is related to the cross-correlation function*®. Because we aim to estimate
the state immediately after heralding it, we reduce the evaluation window of
the read pulse as much as possible, while maintaining reasonable statistics on the
cross-correlation (see Extended Data Table 1). From the measured
$g^{(2)}_{om}(\Delta n = 0, \delta t = 100\,\mathrm{ns}, t_{eval} = 30\,\mathrm{ns}) = 19.6^{+3.9}_{-2.8}$ we infer an autocorrelation
function for the heralded mechanical state of $g^{(2)}_{mm,\mathrm{heralded}} = 0.22 \pm 0.04$,
which approximately relates to the ratio of probabilities of higher excitations ($p_{n>1}$)
to single-phonon excitations ($p_{n=1}$). At the same time,
the main contribution for non-zero $p_{n=0}$ (the probability of the heralded mechanical
state being the ground state) is false-positive heralding events, that is, dark
counts and leaked pump photons. With the known ratio of true-positive to
false-positive heralding events (see Methods section ‘Characterization of the detection
scheme’), we obtain an estimate of $p_{n=0} \approx 1/25$. With these conservative
estimates, we obtain a heralded Fock-state fidelity of $p_{n=1} = 87.7\% \pm 1.2\%$ on the
basis of the standard system Hamiltonian of the optomechanical device’.
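The numbers in this estimate can be explored with a small sketch. It rests on two assumptions that are not spelled out in the text and may differ from the authors' calculation: the heralded autocorrelation follows $4/(g^{(2)}_{om} - 1)$ as stated in the main text, and higher excitations obey the common weak-excitation approximation $p_{n>1} \approx (g^{(2)}/2)\,p_{n=1}^2$, with $p_{n=0} + p_{n=1} + p_{n>1} = 1$. The function names are illustrative.

```python
import math


def heralded_autocorrelation(g_cross):
    """Heralded g2 from the cross-correlation, using 4 / (g_cross - 1) for g_cross >> 1."""
    return 4.0 / (g_cross - 1.0)


def single_phonon_probability(g_heralded, p_vacuum):
    """Solve p1 + (g / 2) * p1**2 = 1 - p_vacuum for p1.

    Assumes p(n>1) ~ (g / 2) * p1**2 (weak-excitation approximation) and that
    the populations p0, p1 and p(n>1) sum to one.
    """
    a = g_heralded / 2.0
    remaining = 1.0 - p_vacuum
    if a == 0.0:
        return remaining
    return (-1.0 + math.sqrt(1.0 + 4.0 * a * remaining)) / (2.0 * a)


g_h = heralded_autocorrelation(19.6)
p1 = single_phonon_probability(g_h, p_vacuum=1.0 / 25.0)
print(f"g_heralded = {g_h:.3f}, p1 = {p1:.3f}")
```

Under these assumptions the sketch gives a heralded autocorrelation of about 0.22 and a single-phonon probability close to the quoted fidelity, which suggests, but does not prove, that the reported value was obtained along similar lines.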
Extended Data Figure 1 | Optomechanical device. Shown is a scanning
electron microscope image of a set of nanobeams, which are fabricated
in silicon, as described in the text. Light is coupled into the central,
adiabatically tapered waveguide through a lensed optical fibre (not shown)
from the left of the image. The field then evanescently couples to each
nanobeam (top and bottom). The two devices have slightly different
resonance frequency, which makes it possible to distinguish them.
Extended Data Figure 2 | Detailed experimental set-up. See Methods section ‘Set-up’ for a description.
Extended Data Figure 3 | Pump-probe measurement of the mechanical
response. We send in a brief, intense blue-detuned optical pulse (pump) and
measure the mechanical response via a red-detuned optical probe pulse as a
function of pump-probe time delay ($\delta t$). a, Long-term mechanical response.
The result fits well with a simple exponential decay (red dashed line; see
equation in the plot) with a damping time constant ($T_1$) of 34.4 μs. The inset
shows the same data/fit with a logarithmic scale on the x axis. $C_{AS,0}$ is the
extrapolated $C_{AS}(\delta t = 0)$. b, Short-term mechanical response. The data
are fitted to a simple exponential curve (green dashed line; see equation in
the plot). The fitted time constant is 0.37 μs. The fit curve of the
long-term response (red dashed line) projected to 0-μs delay is also
shown for comparison. Because the pump-pulse energies were five times
stronger than those of the write pulses in the correlation experiment, it is
expected that the delayed heating occurs on a longer timescale, owing to the
temperature dependence of the thermal conductivity of silicon**. Error bars
in a and b represent a 68% confidence interval.
Extended Data Table 1 | Counts of the cross-correlation measurements

$\delta t$ 0.1 μs
N(R∩W) 202
N(R1∩R2) 13
N(R1) 13,172
N(R2) 17,490
N(W1) 12,471
N(W2) 19,409
T 38,806,017

35,712,159

3.1 μs
23,629
31,892
15,032
23,601
47,964,927

total
80
64,485
100,131
201,271,242

16
966
1,145
12,471
19,409
38,806,017


The row labels (event) represent the number of counts for a certain event, for example, detection of a photon during the measurement window of the read pulse on detector 1 or 2 ($R_{1,2}$), or the
coincidence of a detection event of a subsequent write and read pulse on either detector 1 or 2 ($R \cap W = (R_1 \cup R_2) \cap (W_1 \cup W_2)$). $T$ denotes the total number of pulse pairs sent to the optomechanical
device. For the calculation of the autocorrelation function of the read pulse, only counts from the delay setting $\delta t$ are used, because the delayed heating of the blue pulse (see Extended Data Fig. 3)
influences the mechanical state. For the autocorrelation function of the write pulse, counts from all delay settings are summed, because the mechanical state is reinitialized by cryogenic cooling
before measurement, independent of the delay $\delta t$. The numbers for this are summarized in the column labelled ‘total’. The highest reported cross-correlation value was obtained by reducing the
measurement window of the read pulse from 55 ns to 30 ns, with a delay of $\delta t = 100$ ns between the write and the read pulse. The counts for this evaluation window are presented in the column labelled
‘0.1 μs*’. The underlying data set is the same as for the standard evaluation period of 55 ns, shown in the first column.
Uranium-mediated electrocatalytic dihydrogen production from water

Dominik P. Halter¹, Frank W. Heinemann¹, Julien Bachmann¹ & Karsten Meyer¹


Depleted uranium is a mildly radioactive waste product that 
is stockpiled worldwide. The chemical reactivity of uranium 
complexes is well documented, including the stoichiometric 
activation of small molecules of biological and industrial 
interest such as H$_2$O, CO$_2$, CO, or N$_2$ (refs 1-11), but catalytic
transformations with actinides remain underexplored in
comparison to transition-metal catalysis'”"'*. For reduction of
water to H$_2$, complexes of low-valent uranium show the highest
potential, but are known to react violently and uncontrollably
forming stable bridging oxo or uranyl species’». As a result, only
a few oxidations of uranium with water have been reported so
far; all stoichiometric? !©!”. Catalytic H$_2$ production, however,
requires the reductive recovery of the catalyst via a challenging
cleavage of the uranium-bound oxygen-containing ligand. Here
we report the electrocatalytic water reduction observed with
a trisaryloxide U(III) complex [(($^{Ad,Me}$ArO)$_3$mes)U] (refs 18
and 19), the first homogeneous uranium catalyst for H$_2$ production
from H$_2$O. The catalytic cycle involves rare terminal U(IV)–OH
and U(V)=O complexes, which have been isolated, characterized,
and proven to be integral parts of the catalytic mechanism. The 
recognition of uranium compounds as potentially useful catalysts 
suggests new applications for such light actinides. The development 
of uranium-based catalysts provides new perspectives on nuclear 
waste management strategies, by suggesting that mildly radioactive 
depleted uranium—an abundant waste product of the nuclear 
power industry—could be a valuable resource. 

To evaluate the electrocatalytic H2O reduction with our U(III)
complex, we performed reference experiments for uncatalysed H2O
reduction under identical conditions. As expected, the inert glassy
carbon electrode possesses a considerably higher overpotential (η)
for water reduction than does a platinum (Pt) electrode, as presented
in the cyclic voltammograms in Fig. 1a (0.22 M H2O in THF).
We determined the thermodynamic potential for H2 production
under these conditions at −1.434 V versus Fc+/Fc using open-circuit
potential (OCP) measurements (for details, see Supplementary
Information). Cyclic voltammetry shows the onset potential for H2
evolution at −2.25 V (versus Fc+/Fc, with η = 0.82 V) for a Pt electrode
(red trace), whereas the onset potential is shifted to −3.25 V (versus
Fc+/Fc, η = 1.82 V) for a glassy carbon electrode (black trace). The
homogeneous uranium catalyst described and analysed in this study
is the mesitylene (mes) anchored trisaryloxide uranium(III) complex
[((Ad,MeArO)3mes)U] (1), which incorporates bulky adamantyl (Ad)
groups at the ortho positions and methyl (Me) groups at the para
positions of the aryloxide arms (ArO). The addition of a minute amount
of 1 (8.9 × 10−4 M, 0.4 mol%) to the electrolytic solution lowers the
onset potential by 500 mV to −2.75 V (versus Fc+/Fc, η = 1.32 V) at the
glassy carbon electrode, and substantially increases the current density
from −9.38 µA cm−2 to −0.232 mA cm−2 (at −3.45 V versus Fc+/Fc,
η = 2.02 V; blue trace).
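
The overpotentials quoted in this paragraph follow from the measured thermodynamic potential of −1.434 V (versus Fc+/Fc). As a minimal sketch, the Python snippet below recomputes them from the reported potentials and also checks the ratio of the two current densities. The function and variable names are illustrative assumptions, not part of the original study.

```python
# Thermodynamic potential for H2 production under the stated conditions,
# in volts versus Fc+/Fc (value taken from the text).
E_THERMO = -1.434


def overpotential(e_applied, e_thermo=E_THERMO):
    """Overpotential (V): magnitude of the gap between applied and thermodynamic potential."""
    return abs(e_applied - e_thermo)


# Onset potentials reported in the text (V versus Fc+/Fc).
onset_potentials = {
    "Pt electrode": -2.25,
    "glassy carbon, no catalyst": -3.25,
    "glassy carbon, with 1": -2.75,
}

for label, e_onset in onset_potentials.items():
    print(f"{label}: eta = {overpotential(e_onset):.2f} V")

# Ratio of the current densities reported at -3.45 V versus Fc+/Fc.
j_uncatalysed = 9.38e-6   # A cm^-2 (magnitude)
j_catalysed = 0.232e-3    # A cm^-2 (magnitude)
current_gain = j_catalysed / j_uncatalysed
print(f"current increase: about {current_gain:.0f}-fold")
```

The same helper applied to −3.45 V gives the overpotential at which the current densities are compared.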

A similar trend is observed in bulk electrolysis experiments, in
which the catalyst 1 induces a substantial increase in the steady-state
current of the water reduction by a factor of ten, from −17.6 µA to
−174 µA at a potential of −3.25 V (versus Fc+/Fc, η = 1.82 V). A series
of bulk electrolyses at various potentials confirmed the catalytic
activity of 1, because the charge increases more rapidly at more negative
potentials in the presence of 1 (Fig. 1b). The Faradaic yield of the
H2 evolution was determined to be close to 100% using a GC-TCD
(gas chromatography with thermal conductivity detector; for details,
see Supplementary Information). Control experiments using either
[UI3(THF)4], a common precursor for trivalent uranium complexes,
or pure ligand show no catalytic activity whatsoever (see Fig. 1b). In
fact, the presence of uranium triiodide impedes H2O reduction at very
negative potentials. This proves that the catalytic activity is specifically
due to compound 1, and cannot be explained solely by the presence
of a low-valent uranium ion. In addition, we excluded the formation
of suspended uranium nanoparticles or an electroactive film on the
electrode by multiple reference experiments, including electrolysis
with added Hg (see Supplementary Information).

Electrochemical impedance spectroscopy (EIS) of the H2O reduction
allows for a quantification of the catalytic effect, because the
charge transfer resistance determined by EIS is reduced by three
orders of magnitude, from 1.23 MΩ to 3.70 kΩ, at −3.25 V (versus
Fc+/Fc), upon addition of 1 (Fig. 1c, d). A foot-of-the-wave analysis
yields a turnover frequency of 10°h™' at an overpotential of 1.3 V, 
the observed onset potential for our catalyst, whereas state-of-the-art 
homogeneous transition-metal H2-evolving catalysts are reported 
with turnover frequencies ranging from 1h“ to 1.9 x 10”h7 (ref. 20) 
for varying overpotentials. Important examples of electrocatalytic
H2 evolution include a nickel catalyst with the triflate salt of
protonated dimethylformamide (DMF) in acetonitrile, a cobalt catalyst
with anilinium tetrafluoroborate in DMF, a molybdenum oxo complex,
or a cobalt-based catalyst in buffered aqueous phosphate solution.
These diverse reagents illustrate the difficulties in comparing
individual catalysts under different conditions. Comparison with other
systems, where turnover frequency (TOF) values are not provided, can be
made on the basis of overpotential decrease, Faradaic yields (usually
between 85% and 100%), and an increase of the reductive current density
at a suitable potential for uncatalysed H2 evolution, which, for the
majority of catalysts, leads to a 5- to 15-fold increase, but can reach
up to 40-fold. The reported catalyst 1 reduces the overpotential by
0.5 V with respect to the glassy carbon electrode, has a Faradaic yield
of 100%, a 25-fold reductive current increase at η = 2 V, and a turnover
frequency of 10°h™! at 7 =1.3 V. 

To provide a direct comparison with a recently published transition-metal
system, we tested a Mo-based catalyst, [(PY5Me2)MoO]2+
(provided by J. Long, C. Chang, and D. Zee), under the conditions used
for our catalyst 1. The current-potential curves (see Supplementary
Information) of both systems are very similar.

To gain further insight into the catalytic reactions, and in particular
to highlight the influence of bulky ligands for steric protection on the
reactivity of U(III) complexes with H2O in general, we carried out
stoichiometric reactions.


1Department of Chemistry and Pharmacy, Inorganic Chemistry, Friedrich-Alexander University Erlangen-Nürnberg (FAU), Egerlandstrasse 1, D-91058 Erlangen, Germany.
Figure 1 | Electrochemical characterization of catalyst 1. a, Cyclic
voltammogram of the electrochemical H2O reduction in THF with 0.1 M
TBAPF6 (20 µl H2O in 5 ml THF, 0.22 M) on a glassy carbon electrode,
without catalyst (black) and with 0.4 mol% catalyst 1 (blue). The onset
potential for the water reduction is reduced by 0.5 V, and the reductive
current density at the vertex potential increases from −0.027 mA cm−2
to −0.382 mA cm−2 after addition of the catalyst (5 mg, 0.4 mol%). For
comparison, the H2O reduction on a platinum electrode under similar
conditions is shown (red). E is the potential, measured in volts. b, Plot
of the charge Q passed during a 300-s electrolysis per run at different
potentials E for uncatalysed H2O electrolysis (black), in the presence of


Stoichiometric oxidation of the U(III) complex [((Ad,MeArO)3mes)U] (1)
in THF or toluene with H2O (0.05 M in THF) at room temperature results
in the formation of the U(IV) complex [((Ad,MeArO)3mes)U(OH)(THF)]
(2-OH) (Fig. 2a) and H2, which was identified and quantified by
GC-TCD experiments (see Supplementary Information). Single crystals
of 2-OH suitable for single-crystal XRD (X-ray diffraction) analysis
were obtained as light-green needles at −35 °C. The tetravalent
2-OH features a six-coordinate uranium ion in its solid-state molecular
structure (Fig. 2b), with four oxygen atoms in the equatorial
plane: the three aryloxide ligands with an average U-Oar bond of
2.188 Å, and a coordinated molecule of THF with a U-OTHF distance
of 2.616(2) Å (the number in parentheses, here and elsewhere, is an
estimate of the standard deviation derived from full-matrix
least-squares refinement). The two axial positions are occupied by the
hydroxo ligand and the arene backbone of the chelator. This coordination
geometry and bonding is similar to the analogous U(IV) halide
complexes [((Ad,MeArO)3mes)U(X)(THF)] (with X = F, Cl, Br, I) (ref. 28).
A U-arene centroid distance of 2.703 Å and an average uranium-carbon
bond length U-Car(av) of 3.047 Å are in agreement with a uranium-arene
δ-backbonding interaction.

−3.25 V (versus Fc+/Fc). The charge transfer resistance is three orders of
magnitude greater in the uncatalysed reaction than it is in the catalysed
reaction, demonstrating the catalytic effect of compound 1.

The U-OH distance of the axially bound hydroxo ligand to the uranium
centre is 2.106(2) Å, which is in good agreement with those of
comparable U(IV)-OH species. The hydroxide hydrogen position
was derived from a difference Fourier synthesis and was treated using a
riding model in the subsequent refinement cycles. Also, a hypothetical
terminal oxo ligand can be excluded, because U=O bonds are known to
be much shorter, typically ranging from 1.818 Å to 1.859 Å (refs 29, 30).
Infrared spectroscopy was used to confirm directly the identity of
the U-OH group in 2-OH. In a solid KBr matrix, complex 2-OH
exhibits two sharp O-H stretches with stretching frequencies, ν(OH),
centred at 3,659 cm−1 and 3,630 cm−1, probably due to the presence
of two rotamers of the OH ligand (Fig. 2c). For an unambiguous
assignment of the ν(OH) stretches, the selectively deuterated complex
[((Ad,MeArO)3mes)U(OD)(THF)] (2-OD) was synthesized from
D2O. As expected, the ν(OD) bands were found at lower frequencies,
2,718 cm−1 and 2,677 cm−1, which is in accordance with the theoretically
predicted bathochromic shifts of 990 cm−1 based on a harmonic
oscillator model.
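
The harmonic-oscillator estimate mentioned here can be reproduced in a few lines. The sketch below treats O-H and O-D as isolated diatomic oscillators with the same force constant, so that the stretching frequency scales with the inverse square root of the reduced mass. The isotope masses and all names are assumptions introduced for illustration and do not come from the original work.

```python
import math

# Isotope masses in unified atomic mass units (rounded literature values).
M_H = 1.00783
M_D = 2.01410
M_O = 15.99491


def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)


def predicted_od_frequency(nu_oh):
    """Scale an O-H stretch (cm^-1) to the O-D stretch, assuming an unchanged force constant."""
    ratio = math.sqrt(reduced_mass(M_O, M_H) / reduced_mass(M_O, M_D))
    return nu_oh * ratio


# The two O-H stretching frequencies reported for 2-OH.
for nu_oh in (3659.0, 3630.0):
    nu_od = predicted_od_frequency(nu_oh)
    shift = nu_oh - nu_od
    print(f"v(OH) = {nu_oh:.0f} cm^-1 -> v(OD) = {nu_od:.0f} cm^-1, shift = {shift:.0f} cm^-1")
```

Under these idealised assumptions both bands shift by roughly 990 cm−1. The measured ν(OD) bands at 2,718 cm−1 and 2,677 cm−1 correspond to somewhat smaller shifts, which is the usual direction of deviation for a real, anharmonic O-H oscillator.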

The 1H NMR spectrum of the paramagnetic U(IV) f2 complex
2-OH in C6D6 features the nine signals expected of the
(Ad,MeArO)3mes3− ligand between 22.79 p.p.m. and −5.61 p.p.m.,
as well as two paramagnetically shifted signals of the coordinated
THF (see Supplementary Information). The signal pattern is
almost identical to that of the previously reported U(IV) fluoride complex
[((Ad,MeArO)3mes)U(F)(THF)], for which a dynamic dissociation
equilibrium of the THF ligand was shown to yield the observed pseudo
C3v symmetry. The OH proton was not detected, which is probably
due to paramagnetic line broadening.

The visible-near-infrared electronic absorption spectrum of
2-OH in THF displays eight absorption bands between 537 nm and
1,886 nm with extinction coefficients, ε, ranging from 61 mol−1 cm−1
to 231 mol−1 cm−1 (see Supplementary Information). The pattern
and fine structure of the f-f transition bands compare well with other
U(IV) complexes of the [((Ad,MeArO)3mes)U(X)(THF)] system.
Temperature-dependent SQUID (superconducting quantum interference
device) experiments unambiguously characterize 2-OH as a
U(IV) species with magnetic moments, μeff, of 0.54 μB at 2 K and 2.62 μB
at 300 K (in which μB is the Bohr magneton), as well as a temperature
dependence reminiscent of the [((Ad,MeArO)3mes)U(X)(THF)] series
mentioned above (see Supplementary Information).

Figure 2 | Independent synthesis and characterization of the
uranium(IV) hydroxo complex [((Ad,MeArO)3mes)U-OH] (2-OH).
a, Synthesis of 2-OH with concomitant H2 evolution. b, Molecular
structure of the crystallographically characterized complex 2-OH
in crystals of C67H84O5U · 3(C4H8O), with thermal ellipsoids at 50%
probability. All hydrogen atoms except for the hydroxo H were omitted for
clarity. c, Infrared vibrational spectra of 2-OH (black) and its isotopomer
2-OD (blue), showing the expected isotopic shift for the O-H stretching
vibration ν. The inset is a close-up of the 2-OH spectrum, showing the
two OH stretching frequencies at ν = 3,659 cm−1 and ν = 3,630 cm−1.

After identification of the reaction product 2-OH, we analysed the
mechanism of its formation in situ using low-temperature X-band
EPR (electron paramagnetic resonance) spectroscopy. This first
mechanistic study of the fundamental reaction between H2O and a
molecular U(III) species provides basic chemical insight relevant to
catalytic applications and nuclear waste treatment strategies. On the
basis of the results, a mechanism is postulated (Fig. 3) in which a
U(III) aquo species undergoes an intramolecular insertion to form a
U(V) hydroxo-hydrido intermediate. This intermediate then releases
one equivalent of H2 to form a U(V) oxo complex (Fig. 3, reaction 1).
This proposed mechanism compares well to the reactivity reported
for a molecular molybdenum-based catalyst for H2O reduction,
[(PY5Me2)MoO]2+ (refs 23, 31). In step 2 of the proposed mechanism,
the U(V) oxo species comproportionates with a U(III) aquo complex to
form two equivalents of the U(IV) hydroxo 2-OH (Fig. 3, reaction 2),
a reaction that has been confirmed experimentally (see below and
Supplementary Information for details).

Figure 3 | Postulated mechanism for the reduction of H2O by the U(III)
complex 1, based on EPR results. The addition of H2O to 1 probably
yields a U(III) aquo species, which forms a fleeting U(V) hydroxo-hydrido
intermediate, [((Ad,MeArO)3mes)U(OH)(H)], by intramolecular insertion;
this hydroxo-hydrido species then decays to a U(V) oxo species by
elimination of H2 (reaction (1)). Subsequently, the U(IV) hydroxo complex
2-OH is formed in a comproportionation reaction between the U(V) oxo
and the U(III) aquo species (reaction (2)). In the net reaction, two U(III)
aquo complexes form two molecules of 2-OH and one equivalent of H2.

To elucidate this mechanism, we performed time- and temperature-dependent
EPR experiments with a reaction mixture of 1 and
H2O in a frozen toluene solution at 7.5 K (Fig. 4). Initially, a spectrum
of the neat U(III) f3 starting material (10 mM) in toluene was
recorded, yielding an almost axial signal with g values centred at
1.56, 1.48, and 1.20 (see Supplementary Information), as expected for
[((Ad,MeArO)3mes)U] (1). In the following measurement, a mixture
of 1 (10 mM) in toluene with a sub-stoichiometric amount of H2O
(0.375 equiv.) was prepared. Under these dilute conditions the reaction
takes about 2 h at room temperature for completion. Hence, the sample
was allowed to equilibrate for 5 min at room temperature and then
flash-frozen in liquid nitrogen to trap potential intermediate species
in a frozen solvent matrix. Indeed, we obtained a convoluted spectrum
of at least two species: the U(III) starting material and another,
well-defined rhombic species with simulated g values at 2.73, 1.83, and 1.35,
consistent with an intermediate U(V) f1 species (Fig. 4).

Figure 4 | X-band EPR spectrum of a frozen 10 mM toluene solution
of 1 with a sub-stoichiometric amount of H2O. The EPR data show a
convoluted spectrum of two species: the U(III) starting material and a
well-defined rhombic species, tentatively assigned to the fleeting U(V)
hydroxo-hydrido species. Experimental conditions are as follows:
temperature T = 7.5 K, frequency ν = 8.96286 GHz, power P = 1 mW,
modulation width of 1.0 mT. The experimental spectrum (black) and
simulation (red) under these conditions are shown. The best fit for the
experimental spectrum is a convolution of the signal of 1 in toluene
(simulated, green; g values at g1 = 1.56, g2 = 1.48, g3 = 1.20, with line
widths of W1 = 21.4 mT, W2 = 30.5 mT, W3 = 14.4 mT; relative weight
of 1.0) and the signal of an additional, rhombic transient U(V) species
(simulated, blue; g values at g1 = 2.73, g2 = 1.83, g3 = 1.35, with line widths
of W1 = 18.9 mT, W2 = 25.5 mT, W3 = 26.5 mT; relative weight of 0.70). The
spectra are offset for ease of viewing.
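
To relate the simulated g values to positions on the field axis of an X-band spectrum, the resonance condition hν = g μB B can be evaluated at the stated microwave frequency of 8.96286 GHz. The following sketch does this for both species. The physical constants are CODATA values, and the helper name and dictionary labels are assumptions made for illustration.

```python
# Physical constants (CODATA 2018).
PLANCK_H = 6.62607015e-34         # J s
BOHR_MAGNETON = 9.2740100783e-24  # J T^-1

MICROWAVE_FREQ_HZ = 8.96286e9     # X-band frequency quoted for the experiment


def resonance_field_mT(g, freq_hz=MICROWAVE_FREQ_HZ):
    """Resonance field in millitesla from h*nu = g*mu_B*B."""
    return PLANCK_H * freq_hz / (g * BOHR_MAGNETON) * 1e3


# Simulated g values reported for the two species.
species = {
    "U(III) starting material 1": (1.56, 1.48, 1.20),
    "rhombic U(V) intermediate": (2.73, 1.83, 1.35),
}

for name, g_values in species.items():
    fields = ", ".join(f"g = {g:.2f}: {resonance_field_mT(g):.0f} mT" for g in g_values)
    print(f"{name}: {fields}")
```

A larger g value corresponds to a lower resonance field, so the g = 2.73 component of the rhombic species appears well below the field range spanned by the starting material.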

This rhombic U(V) species is tentatively assigned to a hydroxo-hydrido
complex, formed by an intramolecular insertion reaction
from an intermediate U(III) aquo species, as shown in Fig. 3. At
7.5 K, the U(V)-to-U(III) ratio increases with time until it reaches
saturation after 10 min, which strongly suggests an intramolecular
reaction. Starting from a U(III) aquo complex, this intramolecular
reaction is likely to be the oxidative addition of the aquo ligand to
yield the observed U(V) species, that is, the proposed U(V)
hydroxo-hydrido complex. A hydroxo-hydrido complex would accommodate
an additional ligand in the equatorial position, reminiscent of the
labile coordinated THF in 2-OH. Uranium complexes bearing the
arene-anchored chelator (Ad,MeArO)3mes3− do support the additional
ligand in the equatorial plane, in contrast to other trisaryloxide
chelators. This may be the reason why the closely related triazacyclononane
derivative, [((Ad,tBuArO)3tacn)U] (see Supplementary Information),
does not show any catalytic activity for H2O reduction. Furthermore,
the electronic influence of the uranium-mesitylene δ-backbonding
cannot be excluded; thus, identifying its importance is part of ongoing
studies.

When the EPR sample is thawed, the U(V) hydroxo-hydrido complex
that has accumulated decomposes thermally to form a U(V)
oxo, which then comproportionates with the U(III) aquo complex
to yield two equivalents of the EPR-silent U(IV) hydroxo species
2-OH (Fig. 3, net reaction). Studies of the molecular and electronic
structure of the U(V) oxo complex [((Ad,MeArO)3mes)U(O)] are part
of ongoing research; preliminary results yield an axially symmetric
EPR spectrum, which deviates from the rhombic U(V) signal that
is observed during the formation of the proposed, less-symmetric
hydroxo-hydrido species. Hence, the U(V) hydroxo-hydrido complex
is considered stable at 7.5 K in a frozen solvent matrix. Radical
intermediates, such as H• or OH•, were not detected in the course
of the reaction, but would have been visible prominently in EPR
spectroscopy, had they been present. The role of the uranium oxo
species as an intermediate during the catalytic cycle was further
confirmed by the independent, stoichiometric synthesis of 2 equiv. of
2-OH in a reaction of 1 equiv. of [((Ad,MeArO)3mes)U] (1) with 1
equiv. of [((Ad,MeArO)3mes)U=O], and 1 equiv. of H2O, in quantitative
yield (determined by 1H NMR; see Supplementary Information).
After completion of the EPR reaction (2 h, at room temperature),
the spectrum shows only signals of unreacted U(III) complex 1,
establishing the U(V) species as an intermediate during the formation
of 2-OH. The resultant U(IV) complex 2-OH is EPR-silent, as is
expected for a U(IV) complex with a 5f2 electron configuration and a
non-magnetic singlet ground state (see above). To complete the analysis
of the uranium-based electrocatalytic H2O reduction, we investigated
the stoichiometric regeneration of the U(III) species 1 from the
U(IV) hydroxo complex 2-OH. In addition to the chemical reduction of
2-OH to 1 with KC8 in 1H NMR experiments, square-wave and cyclic
voltammetry revealed that an electrochemical reduction of 2-OH
by electrolysis is also possible.

Square-wave voltammetry experiments of 2-OH reveal a broad
reduction peak, ranging from −1.5 V to approximately −2.2 V
(versus Fc+/Fc). This redox feature is associated with the 2-OH/2-OH−
couple, directly followed by the U(III)/U(II) redox transition at
−2.5 V (versus Fc+/Fc), recently published for 1 (see Supplementary
Information). The characteristic, reversible U(III)/U(II) redox couple
was also observed in cyclic-voltammetry experiments, in which a
pure sample of 2-OH was electrochemically reduced (using diffusion
layer electrolysis at −2.2 V versus Fc+/Fc), which confirmed the
electrochemical conversion of [((Ad,MeArO)3mes)U(OH)(THF)] (2-OH)
to [((Ad,MeArO)3mes)U] (1). Accordingly, the reaction involves the
spontaneous elimination of OH− (via U-O bond cleavage) from a
reduced intermediary anionic complex [((Ad,MeArO)3mes)U(OH)(THF)]−
(2-OH−), a reaction previously unknown for U(IV) complexes.
This reductive splitting of the U-OH bond closes the proposed
catalytic cycle for H2 production from water.

Figure 5 | Postulated electrocatalytic cycle for H2 generation from H2O
in the presence of the homogeneous U(III) catalyst [((Ad,MeArO)3mes)U]
(1). Step 1 (top to bottom-right), H2 evolution and formation of
[((Ad,MeArO)3mes)U(OH)(THF)] (2-OH) through oxidation of 1 with
H2O. Step 2 (bottom-right to bottom-left), electrochemical reduction of
2-OH, forming the transient anion 2-OH−. Step 3 (bottom-left to top),
elimination of OH− from 2-OH− to regenerate catalyst 1.

In the first step of the proposed electrocatalytic cycle (Fig. 5), 2-OH
is formed in a chemical reaction by oxidation of 1 with H2O and
concomitant release of H2. Under the catalytic conditions (250 equiv. of
H2O) the formation of 2-OH is effectively instantaneous, in contrast
to the synthetic conditions (1 equiv. of H2O), under which quantitative
conversion of 1 to 2-OH takes 2 h. In the second step, 2-OH
undergoes a one-electron electrochemical reduction to form the transient
species 2-OH−. The reduction potential of this step, observed
at approximately −1.9 V (versus Fc+/Fc) in square-wave-voltammetry
experiments, does not correspond to an overpotential that is sufficient
for direct proton reduction, but is associated with the 2-OH/2-OH−
redox couple, which is independent of the subsequent, purely thermal
dihydrogen-formation step. Finally, in the third step, 2-OH− eliminates
OH− to regenerate 1 for another catalytic cycle. We further
confirmed the presence of 2-OH during the catalytic cycle by adding
2-OH as the active form of the catalyst, instead of 1, which similarly
yielded the reported catalytic H2O reduction.
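
The "250 equiv." of water under catalytic conditions, together with the 0.22 M water concentration and the 0.4 mol% catalyst loading, can be checked against the quantities given in the Fig. 1 caption and in Methods. The sketch below does this arithmetic. The molar mass of 1 is back-calculated from the Methods entry "30.0 mg, 0.0268 mmol", the density of water is taken as 1.00 g ml−1, and the volume change on mixing is ignored; all three are simplifying assumptions, and the variable names are illustrative.

```python
# Molar mass of catalyst 1, back-calculated from "30.0 mg, 0.0268 mmol" in Methods
# (approximate, limited by rounding in the source).
M_CATALYST = 30.0 / 0.0268    # g mol^-1
M_WATER = 18.015              # g mol^-1
WATER_DENSITY = 1.00          # g ml^-1 (assumed)

water_volume_ml = 0.020       # 20 µl of H2O
catalyst_mass_g = 5.0e-3      # 5 mg of catalyst 1
solution_volume_l = 5.0e-3    # 5 ml of THF

n_water = water_volume_ml * WATER_DENSITY / M_WATER   # mol
n_catalyst = catalyst_mass_g / M_CATALYST             # mol

water_conc = n_water / solution_volume_l              # mol l^-1
catalyst_conc = n_catalyst / solution_volume_l        # mol l^-1
equiv_water = n_water / n_catalyst
catalyst_mol_percent = 100 * n_catalyst / n_water

print(f"[H2O] = {water_conc:.2f} M")
print(f"[1] = {catalyst_conc:.1e} M")
print(f"H2O per catalyst: about {equiv_water:.0f} equiv.")
print(f"catalyst loading: {catalyst_mol_percent:.1f} mol%")
```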

The efficient electrocatalytic water reduction with a homogeneous
uranium-based catalyst established here demonstrates the potential of
actinides in catalysis. Future studies will be directed towards ligand
derivatization for catalyst immobilization on electrode surfaces to
further facilitate the (photo)electrocatalytic production of H2 from water.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Materials. Compound 1, [((Ad,MeArO)3mes)U], was prepared according to the
previously described literature procedure. Ultra-pure water was obtained from
a Thermo Scientific TKA - Lab Type HP 4 purification system, and degassed 
before use. All other reagents were acquired from commercial sources and used 
as received. Solvents were purified using a two-column solid-state purification 
system (Glass Contour System), transferred to the glovebox without exposure 
to air, and stored over molecular sieves and sodium (where appropriate). NMR 
solvents were obtained packaged under argon and stored over activated molecular 
sieves and sodium (where appropriate) before use. 

All air- and moisture-sensitive experiments were performed under dry nitrogen
atmosphere using standard Schlenk techniques, or carried out in MBraun
inert-gas gloveboxes containing an atmosphere of purified dinitrogen. The
glovebox is equipped with a −35 °C freezer. All glassware was dried by storage
in an oven overnight (>8 h) at a temperature of >160 °C.
Synthesis of (2-OH) [((Ad,MeArO)3mes)U(OH)(THF)]. A solution of
[((Ad,MeArO)3mes)U] (30.0 mg, 0.0268 mmol) in THF (5 ml) was treated with 0.9
equiv. H2O (0.05 M in THF, 490 µl, 0.0245 mmol) and stirred at room temperature
for 2 h. The solvent was removed in vacuo, to yield an off-white powder, which was
stirred in n-pentane (1 ml) for 1 h. The precipitate was filtered off over a frit, washed
with n-pentane (3 × 1 ml), and dried in vacuo, to yield the title compound (2-OH)
as an off-white powder in 87% yield (28.0 mg, 0.0232 mmol). Single crystals for
XRD analysis were obtained from n-pentane diffusion into a concentrated solution
of (2-OH) in THF at −35 °C.

1H NMR shifts (400 MHz, C6D6, room temperature; br., broad; s., singlet; d.,
doublet): −5.61 p.p.m. (br. s., 18H), −3.07 p.p.m. (s., 9H), −2.38 p.p.m. (s., 3H),
0.04 p.p.m. (s., 3H), 1.57 p.p.m. (s., 3H), 10.81 p.p.m. (s., 6H), 11.06 p.p.m. (s., 9H),
11.95 p.p.m. (d., J = 11 Hz, 9H), 22.79 p.p.m. (br. s., 9H). Elemental analysis for
C67H84O5U: calculated C, 66.65; calculated H, 7.01; found C, 65.75; found H,
7.42. Infrared absorption bands (w, weak; m, medium; s, strong; vs, very strong):
3,659 cm−1 (m) O-H stretch, 3,630 cm−1 (m) O-H stretch, 2,903 cm−1 (vs),
2,847 cm−1 (vs), 1,440 cm−1 (vs), 1,375 cm−1 (w), 1,342 cm−1 (m), 1,315 cm−1
(m), 1,283 cm−1 (m), 1,232 cm−1 (vs), 1,209 cm−1 (s), 1,184 cm−1 (s), 1,163 cm−1
(s), 1,103 cm−1 (m), 1,072 cm−1 (w), 1,016 cm−1 (w), 980 cm−1 (w), 916 cm−1 (w),
854 cm−1 (m), 819 cm−1 (s), 800 cm−1 (vs), 750 cm−1 (m), 509 cm−1 (s),
457 cm−1 (s), 412 cm−1 (m).
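
The calculated elemental composition and the isolated yield quoted for 2-OH can be cross-checked from the molecular formula, read here as C67H84O5U. The snippet below is a minimal sketch of that arithmetic; the rounded standard atomic weights and the helper names are assumptions supplied for illustration.

```python
# Standard atomic weights (g mol^-1), rounded.
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "O": 15.999, "U": 238.029}

# Composition of 2-OH, C67H84O5U, as used in the elemental analysis.
FORMULA_2_OH = {"C": 67, "H": 84, "O": 5, "U": 1}


def molar_mass(formula):
    """Molar mass (g mol^-1) of a formula given as {element: count}."""
    return sum(ATOMIC_WEIGHTS[el] * n for el, n in formula.items())


def mass_percent(formula, element):
    """Mass percentage of one element in the formula."""
    return 100 * ATOMIC_WEIGHTS[element] * formula[element] / molar_mass(formula)


m_2oh = molar_mass(FORMULA_2_OH)
percent_c = mass_percent(FORMULA_2_OH, "C")
percent_h = mass_percent(FORMULA_2_OH, "H")
print(f"M = {m_2oh:.1f} g/mol, C = {percent_c:.2f}%, H = {percent_h:.2f}%")

# Isolated yield relative to the uranium starting material
# (0.0232 mmol product from 0.0268 mmol of complex 1).
yield_percent = 100 * 0.0232 / 0.0268
print(f"yield = {yield_percent:.0f}%")
```

The computed molar mass is also consistent with the reported pairing of 28.0 mg and 0.0232 mmol of product.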
Synthesis of (2-OD) [((Ad,MeArO)3mes)U(OD)(THF)]. A solution of
[((Ad,MeArO)3mes)U] (50.0 mg, 0.0447 mmol) in THF (5 ml) was treated with 1.0
equiv. D2O (0.1 M in THF, 447 µl, 0.0447 mmol) and stirred at room temperature
for 2 h. The solvent was removed in vacuo, to yield an off-white powder, which was
stirred in n-pentane (1 ml) for 1 h. The precipitate was filtered off over a frit, washed
with n-pentane (3 × 1 ml), and dried in vacuo, to yield the title compound (2-OD)
as an off-white powder in 72% yield (39.0 mg, 0.0323 mmol).

1H NMR shifts (400 MHz, C6D6, room temperature): −5.39 p.p.m. (br. s., 18H),
−2.96 p.p.m. (s., 9H), −2.15 p.p.m. (s., 3H), −0.03 p.p.m. (s., 3H), 1.30 p.p.m. (br.
s., THF, 4H), 2.44 p.p.m. (s., 3H), 3.42 p.p.m. (br. s., THF, 4H), 10.58 p.p.m. (s., 6H),
10.93 p.p.m. (s., 9H), 11.80 p.p.m. (d., J = 11.0 Hz, 9H), 22.50 p.p.m. (br. s., 9H).
Elemental analysis for C67H83DO5U: calculated C, 66.65; calculated H, 7.01; found
C, 66.71; found H, 7.20. Infrared absorption bands: 2,903 cm−1 (vs), 2,849 cm−1
(vs), 2,718 cm−1 (m) O-D stretch, 2,677 cm−1 (m) O-D stretch, 2,575 cm−1 (m),
1,445 cm−1 (vs), 1,375 cm−1 (w), 1,342 cm−1 (m), 1,315 cm−1 (m), 1,285 cm−1 (m),
1,233 cm−1 (vs), 1,184 cm−1 (s), 1,159 cm−1 (s), 1,103 cm−1 (m), 1,069 cm−1 (w),
1,018 cm−1 (m), 980 cm−1 (w), 916 cm−1 (w), 876 cm−1 (m), 854 cm−1 (s), 835 cm−1
(s), 806 cm−1 (s), 746 cm−1 (m), 555 cm−1 (m), 521 cm−1 (s), 413 cm−1 (m).

NMR spectroscopic measurements. 1H NMR spectra were recorded on a JEOL
ECX 400 at a probe temperature of 23 °C. Chemical shifts δ are reported relative to
residual 1H resonances of the solvent in parts per million (ref. 32).

Electronic absorption spectroscopic measurements. Electronic absorption spec- 
tra were recorded from 200 nm to 2,500 nm (Shimadzu, UV-3600) in the indicated 
solvent at room temperature, and plotted from 440 nm to 2,150 nm to emphasize 
the f-f transitions. 

Infrared vibrational spectroscopic measurements. Infrared spectra were
recorded on a Shimadzu Affinity-1 CE FTIR instrument from 400 cm−1 to
4,000 cm−1. Solid samples of the compounds were homogenized with excess
amounts of KBr and a pressed pellet was measured at room temperature.

EPR spectroscopic measurements. Variable-temperature X-band EPR spectra 
were recorded on a JEOL CW spectrometer, JES-FA200, equipped with an X-band 
Gunn diode oscillator bridge, a cylindrical mode cavity, and a helium cryostat. 
Spectra were recorded at T = 7.5 K and simulated with the program W95EPR.
Elemental analysis. CH elemental microanalyses were obtained using Euro EA
3000 (Euro Vector) and EA 1108 (Carlo-Erba) elemental analysers in the Institute
of Inorganic Chemistry at Friedrich-Alexander University Erlangen-Nürnberg.


Gas chromatographic measurements. All gas chromatography measurements
were carried out on a Shimadzu GC-2010, equipped with a 5-Å molecular sieves
column, and a thermal conductivity detector (TCD), operated with N2 as the
carrier gas at 70 °C. All samples (100 µl each) were injected with a gas-tight Hamilton
syringe.

For the quantification of the H2 content in the electrolysis cell head space,
a calibration curve was prepared from multiple measurements with precisely
adjusted H2 concentrations in N2. The obtained calibration curve is shown in
Supplementary Fig. 8. The Faradaic efficiency of the electrocatalytic cycle was
studied by gas-chromatography-coupled electrolysis experiments, where the
passed charge was compared to the amount of H2 detected in GC-TCD
experiments. The electrolysis cell was a standard Schlenk tube, sealed with a rubber
septum, through which all electrodes were passed into the sample solution. After
the electrolysis, 100 µl of the headspace were injected into the gas chromatograph
with a gas-tight Hamilton syringe. Although great effort was made to seal the
electrolysis cell, a H2 leakage of 25% in 20 min was found in time-dependent
GC-TCD experiments, resulting in an observed Faradaic yield significantly below
100%. To correct for the leakage, the same cell was equipped with a Pt working
electrode to electrolyse an aqueous solution of 1.2 M NaOH for comparison. The
electrolysis potential was adjusted such that the constant electrolysis current was
identical to the electrolysis current of the electrocatalytic H2O reduction using
catalyst 1. Indeed, matching H2 contents were found after identical electrolysis
time (75 min) for both the electrocatalytic cycle with 1, and the platinum-catalysed
NaOH electrolysis, strongly suggesting identical Faradaic yields for both
systems. Qualitative H2 evolution in stoichiometric reactions of 1 with H2O to form
2-OH was also confirmed by GC-TCD experiments. For this purpose, the synthesis was
conducted in a closed reaction vessel, which was sealed with a septum to take a
sample of the headspace.
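
Comparing the passed charge with the detected H2, as described in this section, amounts to a Faradaic-efficiency calculation: two electrons are needed per H2 molecule, so the theoretical amount of H2 is Q/(2F). The sketch below illustrates the bookkeeping only. The current and duration are borrowed from values quoted elsewhere in the paper (a steady-state current of 174 µA in magnitude and a 75 min electrolysis) as an assumed pairing, and the "detected" amount is a made-up placeholder, not a measured result.

```python
FARADAY = 96485.332      # C mol^-1
ELECTRONS_PER_H2 = 2     # 2 H+ + 2 e- -> H2


def theoretical_h2_mol(charge_c):
    """Moles of H2 expected if every electron passed ends up in H2."""
    return charge_c / (ELECTRONS_PER_H2 * FARADAY)


def faradaic_efficiency(detected_h2_mol, charge_c):
    """Fraction of the passed charge accounted for by the detected H2."""
    return detected_h2_mol / theoretical_h2_mol(charge_c)


# Illustrative input: constant current for a fixed time (assumed pairing).
current_a = 174e-6       # A, magnitude of the steady-state current
duration_s = 75 * 60     # s
charge_c = current_a * duration_s

expected_h2 = theoretical_h2_mol(charge_c)
print(f"charge = {charge_c:.3f} C, expected H2 = {expected_h2 * 1e6:.2f} µmol")

# Hypothetical detected amount, for demonstration only.
detected_h2 = 3.0e-6     # mol (placeholder, not from the paper)
efficiency = faradaic_efficiency(detected_h2, charge_c)
print(f"apparent Faradaic efficiency = {100 * efficiency:.0f}%")
```

An apparent efficiency below 100% in such a calculation can reflect gas loss rather than side reactions, which is why the text calibrates against a Pt-catalysed electrolysis run in the same cell at the same current.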
Electrochemical experiments. Electrochemical experiments were carried out
using a three-electrode set-up with a rotating glassy carbon working electrode
(3 mm in diameter) and platinum rods as counter and reference electrodes. The
potentiostat was a Metrohm µAutolab Type-III. The entire set-up was placed
inside a nitrogen-equipped glovebox, and only degassed, unstabilized, and dry
THF (stored over activated molecular sieves) was used during the experiments. A
platinized platinum electrode was freshly prepared before use according to
previously described procedures, and handled in N2 or H2 atmosphere during the
experiments. OCP measurements for the determination of the thermodynamic
potential for H2 formation, as introduced in ref. 35, were measured with a
platinized platinum electrode, a platinum counter electrode, and a Ag wire reference
electrode in a solution of 0.22 M H2O in THF with 0.1 M TBAPF6 under 1 atm of
H2. EIS experiments were carried out using a three-electrode set-up with a glassy
carbon working electrode, a platinum sheet counter electrode, and a Ag wire
reference electrode. The potentiostat used for EIS and OCP measurements was a
CompactStat potentiostat from Ivium Technologies and the impedance spectroscopy
fits were performed by the instrument's software. All samples were measured
in 0.1 M electrolyte solutions of TBAPF6 (purchased from Sigma Aldrich and used
without purification) in THF. Reported half-wave potentials were referenced to
the Fc+/Fc redox couple by adding recrystallized and purified ferrocene to the
sample solution.
SQUID magnetization measurements. Magnetism data of crystalline powdered
samples were recorded with a SQUID magnetometer (Quantum Design) at 10 kOe
(2-300 K). Values of the magnetic susceptibility were corrected for the underlying
diamagnetic increment (χdia = −694.04 × 10−6 cm3 mol−1 for 2-OH) by using
tabulated Pascal constants and the effect of the blank sample holders (gelatin
capsule/straw). Samples used for magnetization measurements were checked
for chemical composition and purity by CH elemental analysis and 1H NMR
spectroscopy.
Single-crystal X-ray crystallographic analyses. Suitable single crystals of each
sample of 2-OH as well as of [((Ad,tBuArO)3tacn)U(OH)] · 2Et2O were embedded
in protective perfluoropolyalkylether oil and transferred to the cold nitrogen
gas stream of the diffractometer. Intensity data for both samples of 2-OH were
collected using MoKα radiation (wavelength λ = 0.71073 Å) on a Bruker Kappa
APEX 2 IµS Duo diffractometer, equipped with QUAZAR focusing Montel optics;
intensity data for [((Ad,tBuArO)3tacn)U(OH)] · 2Et2O were collected using MoKα
radiation (λ = 0.71073 Å, graphite monochromator) on a Bruker-Nonius Kappa
CCD diffractometer. Data were corrected for Lorentz and polarization effects;
semi-empirical absorption corrections were performed on the basis of multiple
scans using SADABS (version 2014/4 (2014) for 2-OH from THF/n-pentane and
2-OH from THF; version 2.06 (2002) for [((Ad,tBuArO)3tacn)U(OH)] · 2Et2O;
Bruker AXS, Inc.). The structures were solved by direct methods and refined by
full-matrix least-squares procedures on F2 using SHELXL-2014/6 (ref. 37). All
non-hydrogen atoms were refined with anisotropic displacement parameters.
Complex 2-OH from THF/n-pentane crystallized with a total of three molecules 
of THF per formula unit. All three THF molecules were disordered. Two alternative 
orientations were refined in each case resulting in site occupancies of 26.4(7)% and 
73.6(7)% for the atoms O100-C104 and O110-C114, 43.6(10)% and 56.4(10)% for 
the atoms O200-C204 and O210-C214, and 49.1(9)% and 50.9(9)% for the atoms 
O300-C304 and O310-C314, respectively. SAME, SIMU, and ISOR restraints were 
applied in the refinement of the disorder. A tentative hydrogen position for the 
hydroxide H atom H5 could be determined from a difference Fourier synthesis. 
This hydrogen atom was treated using a riding model. All other hydrogen atoms 
were placed in positions of optimized geometry. The isotropic displacement param- 
eters of all hydrogens were tied to those of their corresponding carrier atoms by 
a factor of 1.2 or 1.5. 

In the crystal structure obtained for 2-OH from THF, the ligand was heavily 
disordered. Two alternative orientations of two out of the three ligand arms were 
refined giving site occupancies of 57.9(3)% and 42.1(3)% for the affected atoms 
(atom names of the minor fraction were denoted with an additional A). One of 
the adamantyl groups was disordered with a different ratio of the two orienta- 
tions giving site occupancies of 77.0(5)% and 23.0(5)% for the atoms C18-C25 
and C18A-C25A, respectively. The compound crystallized with a total of 3.42 
molecules of THF per formula unit. The partly occupied THF (O400-C404) 
occupies the site that the phenyl ring C29-C34 takes in the major fraction, and 
is therefore present only when the minor fraction of the ligand disorder is realized. SAME, SIMU, 
and ISOR restraints were applied. A tentative hydrogen position for the hydroxide 
H atom H5 could be determined from a difference Fourier synthesis, but, owing 
to both the heavy U atom and the strong disorder of the ligand, its positional 
parameters could not be refined. In the final refinement cycles, H5 was therefore 
placed in a position of optimized geometry and maximized electron density. All 
other hydrogen atoms were also placed in positions of optimized geometry. The 
isotropic displacement parameters of all hydrogens were tied to those of their 
corresponding carrier atoms by a factor of 1.2 or 1.5. 

In the crystal structure obtained for [((Ad,tBuArO)3tacn)U(OH)] · 2Et2O, one 
of the tBu groups of the ligand was disordered. Two alternative orientations were 
refined giving site occupancies of 80(2)% and 20(2)% for the affected atoms 
C67-C69 and C67A-C69A, respectively. The compound crystallized with two 
molecules of diethyl ether per formula unit, one of which was disordered. Two 
alternative orientations were refined giving site occupancies of 57.9(9)% and 
42.1(9)% for the atoms C201-C205 and C211-C215, respectively. SAME, SIMU, 
and, in part, ISOR restraints were used in the refinement of the disordered atoms. 
All hydrogen atoms including the hydroxide H were placed in positions of opti- 
mized geometry. The isotropic displacement parameters of all hydrogens were tied 
to those of their corresponding carrier atoms by a factor of 1.2 or 1.5. 
Ice stream activity scaled to ice sheet volume during 
Laurentide Ice Sheet deglaciation 


C. R. Stokes¹, M. Margold¹†, C. D. Clark² & L. Tarasov³ 


The contribution of the Greenland and West Antarctic ice sheets 
to sea level has increased in recent decades, largely owing to the 
thinning and retreat of outlet glaciers and ice streams‘. This 
dynamic loss is a serious concern, with some modelling studies 
suggesting that the collapse of a major ice sheet could be imminent** 
or potentially underway’ in West Antarctica, but others predicting a 
more limited response®. A major problem is that observations used to 
initialize and calibrate models typically span only a few decades, and, 
at the ice-sheet scale, it is unclear how the entire drainage network 
of ice streams evolves over longer timescales. This represents one of 
the largest sources of uncertainty when predicting the contributions 
of ice sheets to sea-level rise*"!°. A key question is whether ice 
streams might increase and sustain rates of mass loss over centuries 
or millennia, beyond those expected for a given ocean-climate 
forcing®'°. Here we reconstruct the activity of 117 ice streams that 
operated at various times during deglaciation of the Laurentide Ice 
Sheet (from about 22,000 to 7,000 years ago) and show that as they 
activated and deactivated in different locations, their overall number 
decreased, they occupied a progressively smaller percentage of the ice 
sheet perimeter and their total discharge decreased. The underlying 
geology and topography clearly influenced ice stream activity, but— 
at the ice-sheet scale—their drainage network adjusted and was 
linked to changes in ice sheet volume. It is unclear whether these 
findings can be directly translated to modern ice sheets. However, 
contrary to the view that sees ice streams as unstable entities that 
can accelerate ice-sheet deglaciation, we conclude that ice streams 
exerted progressively less influence on ice sheet mass balance during 
the retreat of the Laurentide Ice Sheet. 

Continental ice sheets are drained by a network of rapidly flowing 
ice streams with tributaries of intermediate velocity that extend far into 
their interiors'’. Towards the margins of modern ice sheets, many ice 
streams become confined within glacial troughs and are referred to as 
marine-terminating outlet glaciers, whereas others occupy regions of 
more subdued relief'”. Their large size (up to tens of kilometres wide 
and hundreds of kilometres long) and high velocity (up to thousands of 
metres per year) mean that they are an important mechanism through 
which ice is transferred to the ocean, and thereby affects the sea level. 
In contrast to climatically forced melting)’, ice stream dynamics could 
introduce considerable nonlinearity into the response of ice sheets to 
external forcing®’. This potential nonlinearity is viewed as the major 
source of uncertainty in assessments of future changes in ice sheets and 
sea level'®*"!° leading to the question of whether the drainage network 
of ice streams arises in response to climatically driven changes in ice 
sheet volume, or if it can evolve to drive changes beyond that which 
might be expected from climatic forcing alone. 

Building on a recent inventory¹⁴ of 117 ice streams in the former 
North American Laurentide Ice Sheet (LIS; including the Innuitian Ice 
Sheet, but excluding the Cordilleran Ice Sheet) (Extended Data Fig. 1), 
we use the best-available ice margin chronology (based on about 4,000 
dates; Extended Data Fig. 2)¹⁵ to ascertain the timing of their activity 


during deglaciation (see Methods). Using the mapped extent of ice 
streams, we calculated the number of them operating over time and 
the percentage of the ice sheet perimeter that was streaming. We also 
explore their potential ice discharge during deglaciation, albeit with 
larger uncertainties, which we compare with output from numerical 
modelling of the ice sheet during deglaciation¹⁶,¹⁷ (see Methods). 
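
The two counts described in this paragraph can be sketched as follows. This is a hedged illustration rather than the authors' workflow: the `IceStream` record, the use of stream width at the margin as the streaming length, and all sample values are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class IceStream:
    stream_id: int
    on_kyr: float    # age when streaming began (kyr ago; larger = older)
    off_kyr: float   # age when streaming ceased (kyr ago)
    width_km: float  # width of the stream where it meets the ice margin


def active_streams(streams, age_kyr):
    """Return the streams operating at the given age (inclusive bounds)."""
    return [s for s in streams if s.off_kyr <= age_kyr <= s.on_kyr]


def streaming_percentage(streams, age_kyr, perimeter_km):
    """Percentage of the perimeter occupied by active streams at age_kyr."""
    if perimeter_km <= 0:
        raise ValueError("perimeter_km must be positive")
    width = sum(s.width_km for s in active_streams(streams, age_kyr))
    return 100.0 * width / perimeter_km


# Hypothetical records for illustration; not values from the inventory.
EXAMPLE_STREAMS = [
    IceStream(1, on_kyr=22.0, off_kyr=15.0, width_km=100.0),
    IceStream(2, on_kyr=21.0, off_kyr=20.5, width_km=50.0),
    IceStream(3, on_kyr=14.0, off_kyr=12.0, width_km=80.0),
]

if __name__ == "__main__":
    for age in (21.0, 18.0, 13.0):
        count = len(active_streams(EXAMPLE_STREAMS, age))
        pct = streaming_percentage(EXAMPLE_STREAMS, age, perimeter_km=1000.0)
        print(age, count, pct)
```

Dividing the count at each time step by a modelled ice volume for the same step would give the volume-normalized number of ice streams.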

When the LIS was at its maximum extent approximately 22,000 years 
ago (22 kyr ago), ice streams formed a drainage network with a velocity 
pattern resembling that of modern-day ice sheets (Fig. 1a, b). Early 
during deglaciation, numerous ice streams were located in major top- 
ographic troughs and drained the marine-based sectors of the northern 
and eastern margins of the ice sheet for several thousand years (Fig. 1b, c). 
Ice streaming along the land-terminating margins was more transient, 
and we find that numerous ice streams switched on and off in different 
locations during retreat. This empirical assessment of the duration of a 
large population of ice streams reveals that although some ice streams 
persisted for 5-10 kyr, about 40% operated for less than 2 kyr, with many 
(approximately 23%) operating for less than 0.5 kyr (Fig. 2). 

Although ice streams activated and deactivated in different locations, 
we find no evidence for any episodes when the number of ice streams 
increased substantially (Fig. 3a). Throughout the period 22-15.5 kyr 
ago, there were about 50 ice streams, but this number subsequently 
dropped rapidly (for example, approximately 13 kyr ago and 11.5 kyr 
ago), with fewer than 10 ice streams operating after 11.5 kyr. When nor- 
malized by ice sheet volume, the number of ice streams is remarkably 
stable (Fig. 3b), with approximately 2 ice streams per 1,000,000 km³ of 
ice sheet volume for almost 10 kyr. The collapse of the LIS into Hudson 
Bay after 8.5 kyr ago triggered a final flurry of ice stream activity, but 
in a very small ice sheet. 

At its maximum extent, approximately 27% of the LIS margin was 
streaming (Fig. 3c), which is very similar to that found for present-day 
Antarctica (Fig. 1a). This value decreased to between 25% and 20% 
from 16 kyr ago to 13 kyr ago, but then rapidly dropped to approxi- 
mately 5% about 11 kyr ago. Similarly, our order-of-magnitude esti- 
mates of Laurentide ice stream discharge show no obvious increases 
during deglaciation (Fig. 3d). Rather, this dynamic component of mass 
loss was relatively stable from about 22 kyr ago to 15 kyr ago (remaining 
at a value of about 1,500 km³ yr⁻¹), but then rapidly decreased to 
<400 km³ yr⁻¹ by about 11 kyr ago, and then to <100 km³ yr⁻¹ about 
9 kyr ago. When normalized by ice sheet volume, dynamic mass loss 
was relatively stable during the period 22-15 kyr ago, but then dropped 
about 13 kyr ago (Fig. 3e), before increasing temporarily as the ice sheet 
collapsed around Hudson Bay. 

A comparison with estimates of total ice stream discharge from a 
previously published numerical model of the North American Ice 
Sheet complex¹⁶ and inferences from surface mass balance modelling 
at specific time steps¹⁷ indicates that model-derived Laurentide 
ice stream discharges are typically higher and more variable (Fig. 3f). 
Nevertheless, both empirical and modelled estimates show a decrease 
in ice stream discharge from about 15 kyr ago. Moreover, we find a 
Figure 1 | Ice flow velocity of the Antarctic ice sheet compared (at the 
same spatial scale) with reconstructions of ice stream activity in the 
LIS at selected time steps. a, Present-day Antarctic ice sheet velocity", 
with red lines indicating where ice streams intersect the grounding line. 
b-d, Ice streams reconstructed for the LIS at the LGM (approximately 
21.8 kyr ago) (b), 13.9 kyr ago (c) and 10.1 kyr ago (d), as labelled, with 
the radiocarbon (¹⁴C) dates given in parentheses. The extent of the LIS is 
indicated by the light blue shading and dark blue outline. The locations of 


clear link between ice sheet volume and both the number of ice 
streams and the percentage of the ice sheet perimeter they occupied 
(Fig. 4a, b). A similar scaling is seen in both modelled and empirically 
derived discharge (Fig. 4c), but we acknowledge there are much larger 
uncertainties in our estimates of Laurentide ice stream discharge. The 
relative effect of ice streaming is seen more clearly by plotting ice sheet 
volume against the total ice stream discharge normalized by the ice 
sheet volume (Fig. 4d); these data indicate that the relative role of mass 
loss from streaming was unlikely to have increased as the ice volume 
decreased during deglaciation. 

There are a number of factors that influence where ice streams 
develop, with previous work highlighting their strong association with 
topographic troughs, calving margins and soft sedimentary beds'®!*”, 
Topography exerted a strong control on ice stream location in the LIS, 
particularly during early deglaciation (22-14 kyr ago) when its flow 
was steered by the major marine channels of the Canadian Arctic 
Archipelago and the high-relief coasts along the eastern margin. There 
is no glacial geomorphological evidence’ that these ice streams contin- 
ued to operate once the ice sheet lost its marine margin and retreated 
onto lower-relief terrain. Thus, topographic troughs and the marine 
margin clearly modulated the number of ice streams operating over 
time (Figs 1 and 3a). 




the Laurentide ice streams that were active at the given time are shown in 
blue and numbered in black, those that switched off within the preceding 
1 kyr are shown in grey and those that switched on during the subsequent 
1 kyr are shown in dark blue with numbers in red. Orange and white 
dashed lines with arrows show ice stream flow direction. The ice streams 
are labelled according to their inventory number¹⁴. The underlying 
topography is from GTOPO30 digital elevation data. 


Elsewhere, ice streams were abundant on low-relief areas that were 
underlain by soft sedimentary bedrock and thick sequences of till. This 
includes the western and southern margins of the ice sheet'*!°°, where 
numerous ice streams switched on and off during deglaciation, with 
marked changes in trajectory”’ (Fig. 1). These networks of sinuous ice 
streams deactivated as the ice margin withdrew onto the harder igneous 
and metamorphic rocks of the Canadian Shield, pointing to a geological 
control that explains the marked reduction in the number of ice streams 
from about 12 kyr ago (Fig. 3a, Extended Data Fig. 3). However, ice 
streams continued to activate over the low-relief crystalline bedrock of 
the Canadian Shield, with several large, wide (100-200 km) ice streams 
operating for very short periods (a few hundred years) during the final 
stages of deglaciation (beginning about 10 kyr ago; Fig. 1d; Extended 
Data Fig. 3)!07!, 

Although topography and underlying geology exerted an impor- 
tant influence on ice stream activity, we find no evidence for major ice 
sheet instabilities linked to ice stream activity that is reflected in the 
spatial re-organization of their drainage network (for example, marked 
increases in the number of ice streams, or individual ice streams widen- 
ing or enlarging during ice sheet retreat). Rather, we find that the over- 
all number of ice streams decreased and they occupied a progressively 
smaller percentage of the ice sheet perimeter. This finding implies that 
Figure 2 | Duration of individual ice streams in the LIS. Grey-filled bars 
represent the best estimate of the duration of each ice stream; blue and red 
outlines represent data assuming the minimum and maximum duration, 
respectively, for all ice streams (see Extended Data Fig. 1 for ice stream 
locations). 


the final 4-5 kyr of deglaciation (beginning approximately 12 kyr ago) 
was largely driven by surface melt, which is corroborated by independ- 
ent modelling of the surface mass balance of the ice sheet!” and infer- 
ences based on the density of subglacial meltwater channels (eskers)”*. 
Specifically, modelling of the surface energy balance of the ice sheet¹⁷ 
suggests that a transition from a positive to a negative surface mass 
balance occurred between 11.5 kyr ago and 9 kyr ago, when much of the 
LIS retreat occurred at rates two to five times faster than before 11.5 kyr 
ago. In that study¹⁷, volume losses not attributable to surface melting 
were assumed to be from dynamic discharge and, in broad agreement 
with our results (Fig. 3f), the modelling implies that dynamic discharge 


decreased from about 15.5 kyr ago. Our range of discharge estimates 
at the Last Glacial Maximum (LGM) (750-2,300 km³) and in the early 
Holocene (100-700 km³ at 9 kyr ago) also falls within their inferred 
ranges¹⁷ (770-2,750 km³ and 0-1,650 km³, respectively). The major 
difference is that this previous surface energy balance modelling¹⁷ 
suggests that dynamic discharge increased from the LGM to a 
maximum (4,290-4,620 km³) around the time of Heinrich event 1 
(H1; 15.5 kyr ago) when the modelled surface mass balance is largest. 
A positive mass balance is temporarily induced in the other ice sheet 
modelling¹⁶ shown in Fig. 3f to facilitate a large dynamic discharge from 
the Hudson Strait ice stream during H1. We do not depict such extreme 
discharge at this time because our approach is based on modern 
ice stream data that are unlikely to capture such extreme discharges. 
However, we find no obvious spatial reorganization” of ice streams 
during or immediately after H1. This finding suggests that H1 had 
a limited effect on the wider ice sheet drainage network, and points 
to extreme velocity fluctuations for specific ice streams (for example, 
Hudson Strait)*°, which we are unable to constrain, and/or mecha- 
nisms that do not invoke major collapses of ice sheets, such as ice shelf 
break-up*®. By contrast, we note some reorganization of ice streaming 
following, but not before or during, Meltwater Pulse 1A (that began 
about 14.6 kyr ago)?”. The saddle collapse that occurred during separa- 
tion of the Laurentide and Cordilleran ice sheets has been hypothesized 
to have contributed to this event”’ and we observe several short-lived 
ice streams in this region after the collapse, but with a concomitant 
decrease in ice stream activity along the southern margin (Fig. 1c). 

It is important to consider whether ice streaming in the LIS offers an 
analogue for modern-day ice sheets. Although the ocean-climate forc- 
ing would have been different during deglaciation of the LIS, there is no 
empirical evidence or theoretical reasoning to suppose that Laurentide 
Figure 3 | Ice stream activity during deglaciation of the LIS. a, Number 
of active ice streams during deglaciation. The black line is the best estimate 
and the orange shading indicates the uncertainty in the age-bracketing 
of ice streams. The grey vertical bar shows the time when the ice sheet 
margin transitioned from a predominantly soft to a predominantly 
hard bed. Data are measured in intervals of 100 yr. b, Number of active 
ice streams normalized by LIS volume obtained from data-calibrated 
numerical modelling¹⁶. Data are measured in intervals of 500 yr; line 
and shading as in a. c, Percentage of the ice sheet perimeter that was 
streaming. Data are measured in intervals of 500 yr; line and shading 
as in a. d, First-order estimate of the total Laurentide ice stream discharge 
based on a width-discharge regression from modern-day ice stream data 
(black line; see Methods). Orange shading indicates the uncertainty in 
both the age-bracketing of the ice streams and the discharge uncertainty 
from the 95% confidence intervals of the regression. For comparison, 
the brown line and grey shading show the same range of discharges 
obtained from the regression without two obvious outliers (Extended 
Data Fig. 10). Dashed line shows discharge from a cruder, two-state 
approximation for the best estimate of ice stream duration (see Methods). 
e, Ice stream discharge (black line in d) normalized by LIS volume from 
numerical modelling¹⁶; orange shading as in a. f, Empirical ice discharge 
(black line and orange shading in d) compared to ice stream discharge 
generated from the mean of a data-calibrated numerical modelling 
ensemble of the LIS¹⁶. The light blue and green lines show ice stream 
discharge from grid cells on the margin of the grounded ice in which 
the modelled velocities were >500 m yr⁻¹ and >100 m yr⁻¹, respectively 
(both with 1σ uncertainties). The dark blue line shows the discharge that 
is inferred from previous modelling of the surface mass balance of the ice 
sheet¹⁷, with shading indicating uncertainty as propagated from surface 
mass balance results (see ref. 17). 
Figure 4 | Indicators of Laurentide ice stream activity plotted against 
ice sheet volume. a, Number of active ice streams. Black dots show 
results using best estimates of ice stream duration with vertical lines 
indicating the uncertainty in the age-bracketing. b, Percentage of the ice 
sheet perimeter that was streaming (symbols as in a). c, Total ice stream 
discharge as determined from our empirical calculations (black dots 
show best estimate and vertical lines indicate both the age and discharge 
uncertainties from Fig. 3d) and from numerical modelling¹⁶ (blue dots 
show mean and vertical lines show 2σ uncertainties). Modelled discharge 
is extracted from grid cells on the margin of the grounded ice in which 
the modelled velocities were >500 m yr⁻¹ (blue line in Fig. 3f). Data 

for the present-day ice sheets in Greenland (purple diamond), West 
Antarctica (pink triangle), East Antarctica (orange triangle) and West 
and East Antarctica combined (red triangle) are also plotted using recent 
ice stream discharge”? and ice sheet volume*””’ estimates. d, Total ice 
stream discharge as determined from modelled and empirical estimates 
in c normalized by ice sheet volume. In all panels, ice sheet volume is 
derived from the mean of a best-performing ensemble of previously 
published data-calibrated numerical models¹⁶ (see Methods). The low 
modelled streaming fraction at volumes of about 2.4 × 10⁷ km³ (c, d) is 
due to the dynamic facilitation of H1 in that modelling (see Methods). 
Empirical discharges for small ice volumes (<0.2 × 10⁷ km³) have highly 
under-represented uncertainties, owing to higher uncertainties in ice sheet 
volume close to final deglaciation. 
ice streams should behave in a fundamentally different manner. Our 
reconstructed pattern of ice streams at the LGM is remarkably similar 
to the velocity pattern of the Greenland and Antarctic ice sheets, and 
we note that Laurentide ice streams drained a similar proportion of the 
ice sheet perimeter when it was a similar size to present-day Antarctica 
(Fig. 1a, b). Large sectors of the LIS occupied similar physiography to 
modern-day ice sheets, with ice streams exhibiting a similar size, shape 
and spatial organization along its marine margins. The most obvious 
difference is that the LIS retreated onto a low-relief, hard bedrock ter- 
rain and had ice streams that terminated on land and produced large, 


low-relief lobes along much of the southern and western margins!?”°. 


Although these have been likened to some West Antarctic ice streams!”, 


they have no modern-day analogue. However, despite the fact that all 
modern-day ice streams are marine-terminating, large parts of the 
Greenland and East Antarctic ice sheets will have land-based margins if 
they continue to deglaciate”**", which might be within a few millennia 
in Greenland’. Our analysis confirms that the geology and topography 
over which modern-day ice sheets retreat will be a key determinant 
of where ice streams are likely to activate and deactivate®. However, 
we also find a strong dependence between ice sheet volume and ice 
stream activity that also holds for modern-day ice sheets (Fig. 4c), and 
which hints at a more regulatory role in ice sheet dynamics than was 
previously recognized. This result does not preclude instabilities on 
decadal to centennial timescales*’, but suggests that if modern-day 
ice sheets continue to deglaciate, then ice streams are likely to switch 
off, and their relative contribution to mass loss may decrease over 
several millennia, with final deglaciation accomplished most effectively 
by surface melt!””, 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Identifying Laurentide ice stream locations. An ice stream is a region in a 
grounded ice sheet that flows much faster than the regions on either side*!. Where 
the fast-flowing ice becomes bordered by exposed rock (for example, in high-relief 
fjord landscapes), they are usually referred to as marine-terminating outlet glaciers. 
These outlet glaciers typically initiate as ice streams and so we use the term ‘ice 
stream’ throughout, but include outlet glaciers. 

We use a recently published inventory¹⁴ of ice streams in the Laurentide Ice 
Sheet (LIS), which includes 117 ice streams (Extended Data Fig. 1). These were 
identified on the basis of previously published evidence, complemented with new 
mapping using satellite imagery and Digital Elevation Models (DEMs) on land, 
and bathymetric data and swath bathymetry for submerged areas¹⁴. The systematic 
nature of the new mapping from across the entire ice sheet bed means it is very 
unlikely that any major ice streams have been missed¹⁴. 

Ice streams are easily distinguishable on an ice sheet bed from a variety of 
evidence that is now well established in the literature!#?°7!>?-, Their spatially 
discrete, enhanced flow creates a distinctive bedform imprint that is immediately 
recognizable and characterized by several geomorphological criteria”. These 
include highly elongated subglacial bedforms (mega-scale glacial lineations™; 
Extended Data Figs 4 and 5), which have also been observed beneath modern-day 
ice streams*°, and which typically exhibit convergent flow patterns towards a main 
ice stream ‘trunk’. These landform assemblages are often characterized by abrupt 
lateral margins that border areas with much shorter subglacial bedforms, or no 
bedforms at all327936374243 (Extended Data Fig. 5). In some cases, the abrupt mar- 
gin is marked by features known as ice stream shear margin moraines***3378 
(Extended Data Fig. 5). This landform evidence is readily identifiable on satellite 
imagery and aerial photographs*****° >? and, in some cases, is further aug- 
mented by field investigations that reveal discrete (Boothia-type)°? erratic dispersal 
trains or sedimentological evidence from tills that may have been overridden and/ 
or deformed by rapid ice flow. However, there are no ice streams in the inventory 
that are there only on the basis of sedimentological data and most (>85%) have a 
clear bedform imprint¹⁴. 

At a larger spatial scale, it is known that many ice streams are steered by the 
underlying topography, often forming marine-terminating outlet glaciers bordered 
by rock walls“. The inventory includes these topographic ice streams, many of 
which were identified by major cross-shelf troughs and their associated sediment- 
ary depocentres‘® (Extended Data Fig. 6). Swath bathymetry from within these 
troughs commonly reveals many of the geomorphological criteria*” described 
above, such as mega-scale glacial lineations***>”. 

Given previous work that highlights the importance of ‘soft’ bedrock geology 

in influencing ice stream location’®"*", we also analysed the type of bedrock over 
which each ice stream was located. We categorized their underlying geology as 
either (i) predominantly ‘soft’ sedimentary rocks, (ii) predominantly ‘hard’ crys- 
talline rocks (intrusive, metamorphic and volcanic rocks) or (iii) those where the 
spatial footprint of the ice stream extended over a mixture of both soft and hard 
rocks. This allowed us to calculate the number of different types of ice streams in 
each broad geological category over time (Extended Data Fig. 3). 
Dating Laurentide ice streams. We used the best-available pan-ice sheet margin 
chronology’ to bracket the age of the spatial footprints of each ice stream in the 
inventory! The ice margin chronology includes 32 time steps, starting 21.8kyr 
ago (18 MC (radiocarbon) kyr ago) and ending 5.7 kyr ago (5 M4C kyr), based on 
>4,000 dates that are spread across the entire ice sheet bed (Extended Data Fig. 2). 
The database consists of mainly radiocarbon dates, supplemented with varve 
and tephra dates, which constrain ice margin positions and shorelines of large 
glacial lakes. Dates on problematic materials (for example, marl, freshwater shells, 
lake sediment with low organic carbon content, marine sediment, bulk samples 
with probable blended ages, and most deposit-feeding molluscs from calcareous 
substrates) were excluded. Marine-shell dates, a major component, were adjusted 
for regionally variable marine-reservoir effects on the basis of a large new set of 
radiocarbon ages on live-collected, pre-bomb molluscs from Pacific, Arctic and 
Atlantic shores. We use a mixed marine and Northern Hemisphere atmosphere 
calibration curve, whereas an IntCal98 calibration curve was used in ref. 15. 

We used the ice margin chronology to bracket the duration of ice stream activity 
using methods employed in previous work on individual or small numbers of ice 
streams?!*°4!- (Extended Data Fig. 7). In some cases, the duration of ice stream- 
ing might have been short-lived (only a few hundred years), leaving evidence of a 
simple ‘rubber-stamped’ imprint® of their activity, the spatial extent of which can 
be readily matched to just one or two ice margin positions (Extended Data Fig. 7a). 
The more complex landform assemblages of other ice streams (with overprinted 
mega-scale glacial lineations linked to associated ice marginal features) clearly 
indicate that they continued to operate during ice margin retreat (Extended Data 
Fig. 7b); therefore, we fitted the ice stream activity to a series of ice margin positions 
over a longer time span (hundreds to thousands of years). Similar patterns are seen 
for marine-terminating ice streams’? (Extended Data Fig. 7c, d). 

To account for the inherent uncertainties in the dating (and interpolated ice 
margin position) and the spatial extent of each ice stream, we provide a maximum 
possible duration and a minimum duration for each ice stream in the inventory 
(Fig. 2). In cases in which the interpolated ice margin positions indicated a very 
short duration of ice streaming, we set the minimum duration to 100 yr. We chose 
this value because the creation of subglacial bedforms that permit the identifica- 
tion of ice streams probably occurs on timescales of the order of decades**, and 
attempting to date to a higher precision is meaningless given the dating uncertain- 
ties (mainly radiocarbon) and our focus on millennial-scale changes throughout 
deglaciation. 
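
The bracketing described above can be sketched as follows. The 100 yr floor comes from the text; everything else (the four bracketing ages per stream, the function names and the sample values) is an assumption made for illustration and is not the authors' implementation.

```python
MIN_DURATION_YR = 100  # floor used in the text for very short-lived streams


def duration_bounds(on_oldest, on_youngest, off_oldest, off_youngest):
    """Return (minimum, maximum) duration in years for one ice stream.

    Arguments are ages in years before present (larger = older) giving the
    assumed bracketing ages for switch-on and switch-off.
    """
    if not (on_oldest >= on_youngest and off_oldest >= off_youngest):
        raise ValueError("each bracket must be ordered oldest >= youngest")
    maximum = on_oldest - off_youngest
    # Apply the 100 yr floor when the brackets imply a shorter duration.
    minimum = max(on_youngest - off_oldest, MIN_DURATION_YR)
    return minimum, max(maximum, minimum)


def share_shorter_than(durations_yr, threshold_yr):
    """Fraction of streams whose duration is below threshold_yr."""
    if not durations_yr:
        raise ValueError("durations_yr must not be empty")
    return sum(d < threshold_yr for d in durations_yr) / len(durations_yr)


if __name__ == "__main__":
    # Hypothetical bracketing ages, not taken from the margin chronology.
    print(duration_bounds(14000, 13500, 13450, 12900))
    print(share_shorter_than([300, 1500, 6000, 400], 2000))
```

A helper such as `share_shorter_than` is one way to summarize a duration distribution, for example the share of streams operating for less than 2 kyr.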
Estimating Laurentide ice stream discharges. Unfortunately, there is no direct way 
to empirically reconstruct the velocity, and thus discharge, of an ice stream from the 
evidence it left behind. To provide a simple, first-order estimate of the potential 
ice discharge from each ice stream for which only the width is known confidently, 
we used an empirical relationship between the width and the discharge of 81 active 
ice streams, 50 in Antarctica (Extended Data Fig. 8) and 31 in Greenland (Extended 
Data Fig. 9). Ice velocities (in units of metres per year) were extracted from recent 
compilations in Greenland (2008-2009)**° and Antarctica (2007-2009)!!45 
and we used these velocity data sets to measure the width (in kilometres) of the 
ice stream (to the lateral shear margins or exposed rock walls) at the grounding 
line”. Velocity was extracted as a width-averaged value. We then used the
highest-resolution bed data that were available for Greenland” and Antarctica” to calculate
the cross-sectional area (in square kilometres) of each ice stream at the grounding 
line. We calculated the modern ice stream discharge (in cubic kilometres per year) 
by multiplying the velocity data by the ice-thickness data and integrating the output 
along the width of the ice stream at the grounding line. 
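
The discharge calculation described above can be sketched as a trapezoidal integration of velocity multiplied by ice thickness across the grounding line. This is a minimal illustration only: the function name, the sample spacing and the uniform 30-km-wide example are assumptions, not values or code from the study.

```python
def ice_stream_discharge(x_km, velocity_m_per_yr, thickness_m):
    """Integrate velocity x thickness across the grounding line (trapezoidal rule).

    x_km: across-flow positions of the samples (km)
    velocity_m_per_yr: ice velocity at each sample (m/yr)
    thickness_m: ice thickness at each sample (m)
    Returns discharge in km^3/yr.
    """
    if not (len(x_km) == len(velocity_m_per_yr) == len(thickness_m)):
        raise ValueError("all inputs must have the same length")
    # Flux per unit width at each sample, converted from m^2/yr to km^2/yr.
    flux = [(v / 1000.0) * (h / 1000.0)
            for v, h in zip(velocity_m_per_yr, thickness_m)]
    total = 0.0
    for i in range(1, len(x_km)):
        # Area of one trapezoid between neighbouring samples.
        total += 0.5 * (flux[i] + flux[i - 1]) * (x_km[i] - x_km[i - 1])
    return total


# Hypothetical ice stream: 30 km wide, uniform 500 m/yr velocity, 1,000 m thick.
x = [0.0, 10.0, 20.0, 30.0]
q = ice_stream_discharge(x, [500.0] * 4, [1000.0] * 4)
print(f"discharge = {q:.1f} km^3/yr")
```

With uniform velocity and thickness the integral reduces to velocity × thickness × width, which gives a quick check on the unit conversion.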

When ice stream data from Antarctica and Greenland are amalgamated 
(Extended Data Fig. 10), a simple linear regression reveals a weak correlation 
(R² = 0.39) between ice stream width and discharge, which we used to predict an
order-of-magnitude discharge from the width of each Laurentide ice stream that 
was active during deglaciation at each dated margin position (Fig. 3d). The regres- 
sion is clearly influenced by two outliers with extremely high discharge (Pine Island 
Glacier and Thwaites Glacier, West Antarctica). Without them, the correlation 
weakens (R² = 0.31) and our discharge estimates show the same trend, but absolute
discharges are lower (see grey shading in Fig. 3d). We considered removing them 
from the regression, but chose not to and used them in our estimates of Laurentide 
ice stream discharge (for example, in Figs 3e, f and 4c, d) because they allow us 
to partly capture some of the more extreme discharges that might be expected 
in a deglaciating ice sheet. We also extracted the 95% confidence intervals of the 
regression and used these to estimate a lower and upper range of discharge for an 
ice stream of given width. However, these confidence intervals under-represent 
the uncertainty because some assumptions for those confidence intervals (and 
the general validity of linear regression) are broken: (i) Gaussian noise, (ii) no 
correlation between residuals of individual data points and (iii) constant variance. 
Given this, and the clear (and perhaps not surprising) complexity of the relation- 
ship between discharge and width for modern-day ice streams (for example, the 
mean value of linear ice stream flux for Greenlandic ice streams is very different 
from that for Antarctic ice streams, for which there is a stronger relationship), we 
also extracted a discharge relationship with a cruder, two-state approximation that 
avoids the assumptions required for statistically robust application of linear regres- 
sion. The modern-day ice stream data are from one short time period, and yet we 
know that ice streams with a fixed width can accelerate (and decelerate) on short 
(annual–decadal) timescales?~*; however, the extent to which these accelerations
and decelerations are sustained over longer (centennial-millennial) timescales is 
currently unknown. Therefore, we use the simple approach to generate an empir- 
ical order-of-magnitude estimate of ice stream discharge from the LIS, averaged 
over millennial timescales, and note that our empirical results are broadly similar 
to those generated by numerical modelling'®’” (Fig. 3f), albeit typically lower. 
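
As a rough illustration of the width-to-discharge regression, the sketch below fits an ordinary least-squares line and compares predictions with and without a single wide, high-discharge outlier. The data pairs are hypothetical and are not the 81 measured ice streams; the function names are also assumptions made for this example.

```python
def fit_line(widths_km, discharges):
    """Ordinary least-squares fit of discharge on width.

    Returns (slope, intercept, r_squared).
    """
    n = len(widths_km)
    if n < 2 or n != len(discharges):
        raise ValueError("need at least two paired observations")
    mean_x = sum(widths_km) / n
    mean_y = sum(discharges) / n
    sxx = sum((x - mean_x) ** 2 for x in widths_km)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(widths_km, discharges))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(widths_km, discharges))
    ss_tot = sum((y - mean_y) ** 2 for y in discharges)
    return slope, intercept, 1.0 - ss_res / ss_tot


def predict_discharge(width_km, slope, intercept):
    """Predicted discharge (km^3/yr) for an ice stream of the given width."""
    return intercept + slope * width_km


# Hypothetical (width, discharge) pairs; the last pair is a wide outlier
# with very high discharge.
widths = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
discharges = [4.0, 12.0, 15.0, 30.0, 28.0, 110.0]

slope_all, icpt_all, r2_all = fit_line(widths, discharges)
slope_trim, icpt_trim, r2_trim = fit_line(widths[:-1], discharges[:-1])

print("with outlier:   ", predict_discharge(40.0, slope_all, icpt_all))
print("without outlier:", predict_discharge(40.0, slope_trim, icpt_trim))
```

In this toy data set, dropping the outlier lowers the predicted discharge for a 40-km-wide ice stream, which mirrors the direction of the effect reported for the regression without Pine Island Glacier and Thwaites Glacier.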

To evaluate our empirical estimates of ice stream discharge in relation to ice 
sheet volume (see, for example, Figs 3d, e and 4), we extracted ice sheet volume 
from the mean of an ensemble of best-performing model runs from a previously
published data-calibrated numerical model¹⁶. Uncertainties associated
with the modelled ice volumes (see ref. 16) are an order of magnitude less than 
those associated with our estimates of ice stream discharge and are not shown 
(for example, in Fig. 4). We use this same model to compare our empirical esti- 
mates of ice stream discharge against those generated in a numerical model of the 
LIS, with streaming discharge extracted from an ensemble of best-performing 
model runs at 100-yr time steps during deglaciation from 21.8 kyr ago to 5.7 kyr 
ago and ensemble standard deviation in ice stream discharge shown as shading 
around the mean (Fig. 3f). The weighted ensemble mean from this model shows 
a similar trend of decreasing discharge from ice streams (Fig. 3f), but with higher
discharges and greater variability. This is to be expected because our estimates 
based on modern-day ice stream discharges may not capture the full range of ice 
stream behaviour during deglaciation of a mid-latitude ice sheet (for example, 
we have no modern analogue of an extensive land-terminating margin overlying 
soft sediments). The numerical modelling also imposes a data-calibrated reduc- 
tion in ice stream discharge around the Hudson Strait region just before H1 to 
facilitate a dynamic destabilization during this event. This imposed reduction is 
reflected in the reduced ice stream discharge in that model for a few thousand years 
before 17 kyr ago and a temporary increase thereafter (Figs 3f and 4c, d). It is also 
reflected in the low modelled streaming fraction at volumes of about 2.4 × 10⁷ km³
(see Fig. 4d). 
Extended Data Figure 1 | Location of 117 ice streams from a recently
compiled inventory based on previous work and systematic mapping
across the LIS bed. Ice streams are shaded dark blue and numbered
according to their inventory'* number. Orange dashed lines with arrows
show ice stream flow direction. Modern-day ice velocity is shown
for Greenland***’. Underlying topography from GTOPO30 digital
elevation data”.
Extended Data Figure 2 | Distribution of dates and interpolated ice
margin positions. These ice margin positions’ (thin red lines) are
based on dates (black dots) that we used to bracket the age of the spatial
[…] red line shows the updated LGM ice margin (following recent work****).
Underlying topography from GTOPO30 digital elevation data‘”. m a.s.l.,
metres above sea level.
Extended Data Figure 3 | Number of Laurentide ice streams and the
percentage of the margin they drained over time, classified according to
their underlying geology. a, A rapid decrease in the number of ice streams
is observed after about 12 kyr ago (Fig. 3a), which is linked with the retreat
of the LIS onto the hard crystalline rocks of the Canadian Shield!”
(cal kyr BP refers to thousands of calendar years before present).
b, However, several large, wide ice streams were active over the hard bed
geology (see, for example, refs 10, 21), and they drained a large percentage
of the perimeter of the ice sheet.
Extended Data Figure 4 | Mega-scale glacial lineations** on the bed
of the Dubawnt Lake ice stream”’, central Canada. a-c, These features
are a characteristic geomorphological signature of ice streaming and are
readily identifiable on Landsat satellite imagery (a, c) and oblique aerial
photography (b) of the ice stream bed (number 6 in Extended Data Fig. 1).
d, Similar features have been detected beneath Rutford ice stream, West
Antarctica³⁵. Landsat imagery courtesy of the US Geological Survey Earth
Resources Observation Science Centre; photograph by C.R.S. Images in
c and d modified from ref. 35.
Extended Data Figure 5 | Landsat imagery of lateral shear margin
moraines in the Canadian Arctic Archipelago. a, b, Beds of the
M'Clintock Channel ice stream***”, Victoria Island (a; number 10 in
Extended Data Fig. 1) and the Crooked Lake Ice Stream*’, Prince of Wales
Island (b; number 11 in Extended Data Fig. 1). Note the abrupt lateral
margins (marked by white arrows) of the assemblage of mega-scale glacial
lineations that are, in places, marked by lateral shear margin moraines**.
Extended Data Figure 6 | Bathymetric data showing cross-shelf troughs
and a well-preserved bedform imprint from a submarine setting.
a, Cross-shelf troughs formed by ice streams fed by convergence of ice flow
from several fjords along the east coast of Baffin Island. b, c, Drumlins
and mega-scale glacial lineations on the floor of Eclipse Sound (b; boxed
area in a) and a close-up view of one particular region where subglacial
bedforms are well developed (c; boxed area in b). High-resolution
swath bathymetry data are from IBCAO* (a), and from IBCAO*™ and
ArcticNet*® (b, c). Figure redrawn from ref. 57.
Extended Data Figure 7 | Schematic demonstrating the method used
to bracket the age of the spatial footprint of Laurentide ice streams
in both terrestrial and marine settings. These methods have been used
extensively in previous work, but usually on small samples of ice streams 
(see, for example, refs 20, 21, 36, 37, 42, 43). a, In some cases, terrestrial 
ice streams are active, but then deactivate (switch off) as the ice margin 
retreats, which enables them to be bracketed between a small number of 
dated ice margins and implies a short duration of operation. b, In other 
cases, ice streams remain active during deglaciation and continually 
remould their landform assemblage, leaving a more complicated, time- 
integrated landform record, often with a series of overprinted landforms, 
which implies a longer duration of operation. c, d, The same scenarios 
as in a and b, respectively, but for a topographically controlled marine- 
terminating ice stream. 
Extended Data Figure 8 | Location of ice streams in Antarctica where
discharge was estimated. Discharge was estimated from existing data sets
of velocity''*, grounding-line position®® and ice thickness*”. Regression
analysis reveals a weak relationship between the width and discharge of
the ice streams (Extended Data Fig. 10), which we used to estimate
the discharge of Laurentide ice streams for which we know only their
width (see Methods).
Extended Data Figure 9 | Location of ice streams in Greenland where
discharge was estimated. Discharge was estimated from existing data
sets of velocity****, grounding-line position*** and ice thickness”.
Regression analysis reveals a weak relationship between the width and
discharge of the ice streams (Extended Data Fig. 10), which we used to
estimate the discharge of Laurentide ice streams for which we know only
their width (see Methods).
Extended Data Figure 10 | Relationship between ice stream discharge
and width for 81 active ice streams in Antarctica and Greenland.
Discharge calculations derived from velocity data from 2008-2009
(Greenland, green dots)**° and 2007-2009 (Antarctica, blue dots)'"6°8.
Grey shading shows the 95% confidence interval of the linear regression.
Measured ice stream locations are shown in Extended Data Figs 8 and 9.
Moralistic gods, supernatural punishment and the
expansion of human sociality


Benjamin Grant Purzycki, Coren Apicella, Quentin D. Atkinson, Emma Cohen, Rita Anne McNamara, Aiyana K. Willard,
Dimitris Xygalatas, Ara Norenzayan & Joseph Henrich


Since the origins of agriculture, the scale of human cooperation 
and societal complexity has dramatically expanded’”. This fact 
challenges standard evolutionary explanations of prosociality 
because well-studied mechanisms of cooperation based on 
genetic relatedness, reciprocity and partner choice falter as people 
increasingly engage in fleeting transactions with genetically 
unrelated strangers in large anonymous groups. To explain this 
rapid expansion of prosociality, researchers have proposed several 
mechanisms*”. Here we focus on one key hypothesis: cognitive 
representations of gods as increasingly knowledgeable and punitive, 
and who sanction violators of interpersonal social norms, foster and 
sustain the expansion of cooperation, trust and fairness towards 
co-religionist strangers” °. We tested this hypothesis using extensive 
ethnographic interviews and two behavioural games designed 
to measure impartial rule-following among people (n = 591,
observations = 35,400) from eight diverse communities from around 
the world: (1) inland Tanna, Vanuatu; (2) coastal Tanna, Vanuatu; 
(3) Yasawa, Fiji; (4) Lovu, Fiji; (5) Pesqueiro, Brazil; (6) Pointe aux 
Piments, Mauritius; (7) the Tyva Republic (Siberia), Russia; and (8) 
Hadzaland, Tanzania. Participants reported adherence to a wide 
array of world religious traditions including Christianity, Hinduism 
and Buddhism, as well as notably diverse local traditions, including 
animism and ancestor worship. Holding a range of relevant variables 
constant, the higher participants rated their moralistic gods as 
punitive and knowledgeable about human thoughts and actions, 
the more coins they allocated to geographically distant co-religionist 
strangers relative to both themselves and local co-religionists. Our 
results support the hypothesis that beliefs in moralistic, punitive 
and knowing gods increase impartial behaviour towards distant 
co-religionists, and therefore can contribute to the expansion of 
prosociality. 

Among the other factors*~*” that influence the emergence of human 
ultrasociality and complex societies, the diffusion of explicit beliefs in 
increasingly moralistic, punitive and knowledgeable gods may have 
played a crucial role’. People may trust in, cooperate with and inter- 
act fairly within wider social circles, partly because they believe that 
knowing gods will punish them if they do not. Additionally, through 
increased frequency and consistency in belief and behaviour sets, 
commitments to the same gods coordinate people's expectations about 
social interactions® °. Moreover, the social radius within which people 
are willing to engage in behaviours that benefit others at a cost to them- 
selves may enlarge as gods’ powers to monitor and punish increase’”. 
To account for the emergence of these patterns, some evolutionary 
approaches to religion have theorized that cultural evolution may have 


harnessed and exploited aspects of our evolved psychology, such as 
mentalizing abilities, dualistic tendencies and sensitivity to norm com- 
pliance, to gradually assemble configurations of supernatural beliefs 
that promote greater cooperation and trust within expanding groups, 
leading to greater success in intergroup competition. Of course, given 
that cultural evolution can produce self-reinforcing stable patterns of 
beliefs and practices, these supernatural agent concepts may also have 
been individually favoured within groups due to mechanisms related to 
signalling, reputation and punishment?*! 112. Over time, these deities 
spread culturally and came to dominate the modern world religions 
like Christianity, Islam and Hinduism. Such traditions eventually came
to account for a large proportion of the world’s population®”!>4 (see
Supplementary Information section S1). Here we directly test one specific
hypothesis: conceptions of moralistic and punitive gods that know
people's thoughts and behaviours promote impartiality towards distant 
co-religionists, and as a result contribute to the expansion of sociality. 

At the societal level, several lines of converging evidence are con- 
sistent with this hypothesis. For example, after controlling for key 
correlates, analyses of cross-cultural data sets show that larger and 
more politically complex societies tend to have more supernatural 
punishment and moralistic deities*!°, and historical analyses in one 
geographic region show that precursors to supernatural punishment 
beliefs precede social complexity!®. However, these data derive from
qualitative ethnographies of entire societies; a more focused, direct and 
systematic cross-cultural assessment of what individuals think their 
gods care about, and whether or not people explicitly or implicitly view 
their gods as concerned with norms of interpersonal social behaviour 
(termed here as ‘morality’!”!8; see Supplementary Information section 
S4.2) has only recently begun'®°. Analyses of cross-national databases
(for example, the World Values Survey) reveal positive relationships 
between beliefs in hell, beliefs in gods’ power to punish, and various 
self-reported prosocial behaviours”). Although valuable, these lines 
of research primarily rely on survey questions not specifically designed 
to address the research question we are interested in. Moreover, they 
rely on samples drawn broadly from nation states, thus excluding small- 
scale societies that are crucial for assessing questions about the expan- 
sion of prosociality. 

At the individual level, two types of behavioural studies are also 
consistent with this hypothesis, but each has crucial limitations. First, 
laboratory experiments show that exposure to religious remind- 
ers increases generosity and decreases cheating among religious 
believers’*-*°. However, as is the case for most psychological experi- 
ments, the vast majority of these studies rely on Western, Christian- 
majority samples, limiting their generalizability”®. Second, in one 
Table 1 | Site descriptive statistics

Site | Researcher | Economy | Moralistic god | Local god or spirit | n | Females | Age | Material insecurity
Coastal Tanna’ | Atkinson | Horticulture | Christian god | Garden spirit (Tupunus) | 44 | 23 | 35.02 (14.13) | 0.22 (0.36)
Hadza | Apicella | Hunting | Celestial figure (Haine)‡ | Sun (Ishoko)‡ | 68 | 31 | 39.82 (12.08) | 0.82† (0.36)
Inland Tanna§ | Atkinson | Horticulture | Kalpapan (traditional) | Garden spirit (Tupunus) | 76 | 38 | 37.00 (16.17) | 0.26 (0.38)
Lovu | Willard | Wage labour | Hindu Bhagwan | None available | 76 | 52 | 44.56 (16.94) | 0.83 (0.33)
Mauritius | Xygalatas | Wage labour and farming | Hindu Shiva | Spirit/soul/ghost (Nam) | 94 | 27 | 36.56 (15.05) | 0.39 (0.35)
Pesqueiro | Cohen | Wage labour | Christian god | Virgin Mary | 77 | 40 | 34.12 (13.08) | 0.86 (0.24)
Tyva Republic | Purzycki | Wage labour and herding | Buddha Burgan | Spirit-masters (Cher eezi) | 81 | 58 | 33.53 (12.52) | 0.47 (0.28)
Yasawa | McNamara | Fishing and farming | Christian god | Ancestor spirits (Kalou-vu) | 75 | 41 | 38.04 (15.91) | 0.50 (0.40)
Grand mean | | | | | 73.88 | - | 37.34 | 0.55

Means indicated (standard deviations in parentheses). See Extended Data Fig. 1 for a map of field sites.
§One individual was removed from the local co-religionist game due to coin visibility.
‡These two gods closely overlap in conception.
†Answer options were “yes”, “no” or “I don’t know”.

field study”’ across 15 diverse societies of foragers, pastoralists and 
horticulturalists, adherence to Christianity or Islam predicted greater 
fairness in economic games relative to adherence to local/traditional 
religions. This study, however, lacked precise measures for our theoreti- 
cally important components of beliefs about gods’ minds—punishment, 
knowledge and moralism. Moreover, these studies did not consider the 
religious affiliation of the anonymous recipients of players’ monetary 
decisions. It is therefore unclear whether these findings explain the 
expansion of prosociality specifically towards geographically distant 
co-religionists. 

Addressing these limitations, we combined two behavioural 
experiments with detailed ethnographic interviews to assess whether 
participants who report that their moralistic gods are punishing and 
more knowledgeable about human thought and behaviour are more 
likely to impartially allocate money to anonymous, geographically 
distant co-religionists over both themselves and their local commu- 
nity®”. In five of the sites, we also tested whether religious priming 
associated with moralistic gods had effects on gameplay, but these 
had no overall effect (see Supplementary Information sections S2.2.2
and S6.2).

We tested these predictions with a sample of 591 participants (310 
females; observations = 35,400; Table 1 and Extended Data Fig. 1) from 
eight diverse communities, including hunter-gatherers, horticulturalists, 
herders and farmers, as well as fully market-integrated populations 
engaged in wage labour or operating small businesses. The participants 
adhere to a variety of world religious traditions including Christianity, 
Hinduism and Buddhism, and report beliefs in an immense range of 
local supernatural agents, including spirit-masters, saints, ancestors, 
animistic beings, anthropomorphic celestial deities, garden spirits, and 
ghosts (Supplementary Information section S3). 

To measure favouritism towards oneself and local community under 
maximally anonymous conditions, we modified the random allocation
game””®”?. In this game (Fig. 1), participants play in private with
30 coins, two cups and a fair die with three sides of one colour and 
three sides of another colour. In the experiment, the participant's job 
is to allocate each coin to one of the two cups. First, they mentally 
choose one of the cups and then roll the die. If one coloured side comes 
up, players are instructed to put the coin into the cup they mentally 


Figure 1 | The random allocation game. a, b, Generic game setup (a) and
variants used in present work (b). Panel labels in b: self game (cups for self
and distant co-religionist) and local co-religionist game (cups for local
co-religionist and distant co-religionist).
chose. If the die comes up the other colour, people are instructed to
put the coin into the opposite cup from the one they chose. Of course, 
as cup selection occurs only mentally, participants can overrule the 
die in favour of one of the cups without anyone else observing their 
decision. If people play by the rules and thereby allocate the coins 
impartially, the mean number of coins in each cup should be 15, and 
the distribution around this average will be binomial. This allows us to 
test for systematic deviations from this distribution (Supplementary 
Information section S2.2).
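
The impartial benchmark for one game can be written out directly: with 30 coins and a fair die, the number of coins in a given cup follows a binomial distribution with n = 30 and p = 0.5. The sketch below is an illustration of that benchmark, not the authors' analysis code, and the cut-off of 10 coins is an arbitrary example.

```python
from math import comb, sqrt

N_COINS = 30   # coins per game, as described in the text
P_FAIR = 0.5   # chance that a fair die sends a coin to a given cup


def binom_pmf(k, n=N_COINS, p=P_FAIR):
    """Probability that exactly k of n coins land in the cup."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def lower_tail(k, n=N_COINS, p=P_FAIR):
    """Probability that k or fewer coins land in the cup."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


expected_mean = N_COINS * P_FAIR                    # 15 coins per cup
expected_sd = sqrt(N_COINS * P_FAIR * (1 - P_FAIR))  # about 2.74 coins

# Chance that an honest player gives 10 or fewer coins to the distant cup.
p_10_or_fewer = lower_tail(10)
print(expected_mean, round(expected_sd, 2), round(p_10_or_fewer, 4))
```

Any single player's count says little on its own, because honest play still produces a spread of about ±2.7 coins; the test is whether the distribution across many players is shifted away from 15.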

Participants played two counterbalanced games for a total of 60 coin 
allocations per person (Fig. 1). In one game, the local co-religionist 
game, participants chose between a cup assigned to an unspecified 
anonymous co-religionist from their local community and a cup 
assigned to an anonymous co-religionist living in a geographically 
distant community that does not regularly interact with the player's 
community. In the other game, the self game, participants chose 
between a cup for themselves and a cup for another anonymous distant 
co-religionist. In order to control for any effects of ethnicity*® 
and nationality, both local and distant co-religionists were of the same 
ethnic group and nationality as the participant. 

Participants understood that money put into the cups would be 
given to the people they represented, including themselves, and we 
actually distributed allocations to participants and randomly selected 
people described by the cups (that is, there was no deception). After 
gameplay, we asked each participant a battery of questions, including 
a series of counterbalanced questions about two locally relevant deities 
(Supplementary Information section S2).

To assess the gods’ relative moral concern, we conducted prelimi- 
nary ethnographic interviews in each site to identify the most moralistic 
deities (that is, ‘moralistic gods’), as well as locally salient, relatively less 
moralistic, ‘local gods’ or spirits. We verified the degree to which gods 
care about morality with a free-list task asking about gods’ concerns’® 
and scales created to measure how important participants claim pun- 
ishing theft, murder and deceit are to these supernatural beings. We 
measured gods’ punishment and knowledge, using the mean of two, 
two-item, easy-to-understand scales with dichotomous responses. The 
target gods associated with games were rated significantly more moralis- 
tic, knowledgeable and punitive than local gods (see Extended Data Figs 
2 and 3; Supplementary Information section S4). We also aggregated 
gods’ punishment and knowledge scores by averaging all four dichoto- 
mous responses, labelled ‘punishment-knowledge combined’ in Table 2. 
These measures are our key theoretical predictors for game allocations. 

Figure 2 displays the effect of punishment for moralizing gods, with- 
out any controls, and reveals the impact of “I don’t know” answers 
which were otherwise excluded from our analyses below. When peo- 
ple report not knowing if a god punishes, they put considerably fewer 
coins in the cups for distant co-religionists in both games (local
co-religionist game: M = 12.97, s.d. = 4.33; self game: M = 12.50, s.d. = 4.15)
than those who consistently report that their god punishes (local 
co-religionist game: M = 14.58, s.d. = 3.24; self game: M = 14.53, 
Table 2 | Log odds ratios for predicting allocations to distant
co-religionists with 95% confidence intervals from our main
binomial logistic regression models

Variable | Local co-religionist game | Local co-religionist game | Self game | Self game
Moralistic gods’ punishment | 1.15 (1.03, 1.27)** | - | 1.11 (1.00, 1.23)† | -
Moralistic gods’ knowledge | 1.17 (1.00, 1.36)* | - | 1.22 (1.05, 1.42)* | -
Punishment-knowledge combined | - | 1.20 (1.04, 1.37)* | - | 1.23 (1.07, 1.41)**
Material insecurity | 1.02 (0.92, 1.12) | 1.01 (0.92, 1.11) | 0.98 (0.89, 1.08) | 0.96 (0.88, 1.06)
Number of children | 0.98 (0.96, 1.00)† | 0.99 (0.97, 1.01) | 0.98 (0.96, 1.00)* | 0.98 (0.96, 1.00)*
n | 503 | 519 | 504 | 520
Observations | 15,090 | 15,570 | 15,120 | 15,600

All models in this table include field site and additional control variables as fixed effects. For all
punishment-knowledge aggregate models, see Extended Data Table 1 (highlights from models 1
and 4 presented here) and Supplementary Table S9. See Supplementary Tables S5 and S6 for all
other models (highlights from models 1FE presented here). Odds ratios >1 indicate greater odds
in allocating a coin to the distant co-religionist. **P < 0.01, *P < 0.05, †P < 0.10.


s.d. = 3.31). One way to estimate the magnitude of these effects is to 
calculate the quotient of deviations from the ideal impartial allocation 
of 15. Compared to those who don't know, claiming the moralizing 
god punishes increases allocations towards distant co-religionists in 
the self game by a factor of 4.8 and in the local co-religionist game by 
a factor of 5.3. Extended Data Figs 4 and 5 detail the overall allocation 
distributions for both games. 
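
The quotient of deviations can be reproduced from the group means reported above. The sketch assumes the quotient is the shortfall from 15 among "I don't know" respondents divided by the shortfall among those who report that their god punishes; the function name is ours.

```python
IMPARTIAL = 15.0  # ideal impartial allocation per cup


def deviation_quotient(mean_dont_know, mean_punishes, ideal=IMPARTIAL):
    """Ratio of shortfalls from the impartial allocation of 15 coins."""
    return (ideal - mean_dont_know) / (ideal - mean_punishes)


# Means reported in the text for coins given to the distant co-religionist.
local_game = deviation_quotient(12.97, 14.58)  # local co-religionist game
self_game = deviation_quotient(12.50, 14.53)   # self game

print(f"local co-religionist game: {local_game:.1f}")
print(f"self game: {self_game:.1f}")
```

Under this reading, the reported means give a quotient of about 4.8 for the local co-religionist game and about 5.3 for the self game, so the two factors appear to be paired with the opposite games in the sentence above. The result is also sensitive to rounding, because the denominators are less than half a coin.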

We explored this relationship in more detail by regressing the num- 
ber of coins allocated to the distant co-religionist cup on a host of var- 
iables for each game in a large set of binomial regressions (Extended 
Data Table 1 and Supplementary Information section S6). Table 2 
shows a subset of the key predictors for the models with the largest 
set of control variables, including a number of economic and demo- 
graphic variables such as education, material insecurity, number of 
children and field site fixed effects. Using sites as fixed effects allows 
us to remove the variation between our sites, so the results in Table 2 
only capture the differences among individuals within sites. Based on 
previous work®”’, we suspected that material insecurity and number 
of children would increase self and local favouritism, and therefore 
Figure 2 | Allocations to distant co-religionists increase as a function of
moralistic gods’ punishment. Punishment indices are mean values of a
two-item scale (see Supplementary Information section S2.3.2). Error bars
represent bootstrapped (1,000 replications) 95% confidence intervals of the
mean. Histogram labels are sample sizes per category. Note that among the
32 individuals who responded “I don’t know” to the questions pertaining to
moralistic gods’ punishment, 17 were Hadza and 15 were inland Tannese.
Figure 3 | Log odds ratios with 95% confidence interval plots of the
influence of key variables on the odds that a coin goes into the cup for 
the distant co-religionist. Odds ratios >1 indicate an increase and odds 
ratios <1 indicate a decrease in the odds of allocating a coin to the distant 
co-religionist (***P < 0.001, **P <0.01; *P< 0.05; #P <0.15). The x axis 
is on a logarithmic scale. Both models include other controls (n = 390). 
Local co-religionist and self results include sites as fixed effects. Note that 
Indo-Fijians are not included in these models due to the lack of data for 
local gods. See Supplementary Tables S5 and S6 for full models (models 
2FE are presented here). 


we include both in our model (Supplementary Information section 
S2.3.1). To affirm the robustness of these analyses, we estimated many 
alternative models, formulated mixed models, and used both alterna- 
tive standard error estimates and different approaches to modelling 
the error (Supplementary Information section S5.4). Across a wide
range of specifications and models including a host of variables (for 
example, divine rewards, emotional closeness to distant co-religionists, 
among others), both moralistic gods’ punishment and knowledge, as 
well as our aggregate punishment-knowledge variable, are reliably 
associated with less bias against distant co-religionists (Supplementary 
Tables S5-S9). 
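
To relate the odds ratios and confidence intervals in Table 2 to their significance markers, an approximate z statistic and two-sided P value can be recovered from a reported interval. The sketch assumes a Wald-type interval that is symmetric on the log-odds scale. Because it works from rounded published values, it is only a rough check, not the authors' computation.

```python
from math import erfc, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def z_and_p_from_ci(odds_ratio, ci_low, ci_high):
    """Approximate z and two-sided P from an odds ratio and its 95% CI."""
    # Standard error of the log odds ratio implied by the interval width.
    se = (log(ci_high) - log(ci_low)) / (2 * Z_95)
    z = log(odds_ratio) / se
    # Two-sided tail probability of a standard normal variate.
    p = erfc(abs(z) / sqrt(2))
    return z, p


# Punishment-knowledge combined, self game: 1.23 (1.07, 1.41) in Table 2.
z, p = z_and_p_from_ci(1.23, 1.07, 1.41)
print(f"z = {z:.2f}, P = {p:.4f}")
```

For this row the recovered P value falls below 0.01, in line with the double asterisk in the table. An odds ratio of 1.23 means that each unit increase in the combined index multiplies the odds of a coin going to the distant co-religionist by about 1.23.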

We checked whether the effects of moralistic gods’ punishment and 
knowledge were indeed specific to powerful, moralizing gods. We 
added local gods’ punishment and knowledge to the models presented 
in Table 2. Figure 3 shows the odds ratios and confidence intervals 
for these coefficients. Although neither the punishing powers nor 
knowledge of these local deities had any association with the alloca- 
tions, the odds ratios for our key predictors pertaining to moralistic 
gods actually increased. These overall findings are correlational and 
should be interpreted with caution and in combination with other 
evidence, also considering that religious priming did not reveal con- 
sistent effects. However, these patterns reduce concerns that omitted 
third variables might account for the correlations we observe. A third 
variable, in addition to correlating with allocations, would have to 
correlate only with the punishing and knowing character of mor- 
alistic and knowledgeable gods, but not with those same attributes 
in local gods or with the tendency of either type of deity to reward 
people. 

These results build on previous findings and have important impli- 
cations for understanding the evolution of the wide-ranging cooper- 
ation found in large-scale societies. Moreover, when people are more 
inclined to behave impartially towards others, they are more likely to 
share beliefs and behaviours that foster the development of larger-scale 
cooperative institutions, trade, markets and alliances with strangers. 
This helps to partly explain two phenomena: the evolution of large and 
complex human societies and the religious features of societies with 
greater social complexity that are heavily populated by such gods®”’. 
In addition to some forms of religious rituals and non-religious norms 
and institutions, such as courts, markets and police, the present results 
point to the role that commitment to knowledgeable, moralistic and 
punitive gods plays in solidifying the social bonds that create broader 
imagined communities’?!???.


18 FEBRUARY 2016 | VOL 530 | NATURE | 329 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 10 June 2015; accepted 7 January 2016. 
Published online 10 February 2016. 


1. Richerson, P. J. & Boyd, R. Complex societies: the evolutionary origins of a 
crude superorganism. Hum. Nat. 10, 253-289 (1999). 

2. Turchin, P. in Cultural Evolution: Society, Technology, Language, and Religion (eds 
Richerson, P. J. & Christiansen, M. H.) 61-73 (MIT Press, 2013). 

3. Fehr, E., Fischbacher, U. & Gächter, S. Strong reciprocity, human cooperation, 
and the enforcement of social norms. Hum. Nat. 13, 1-25 (2002). 

4. Turchin, P., Currie, T. E., Turner, E. A. L. & Gavrilets, S. War, space, and the 
evolution of Old World complex societies. Proc. Natl Acad. Sci. USA 110, 
16384-16389 (2013). 

5. Johnson, D. D. P. God’s punishment and public goods. Hum. Nat. 16, 410-446 
(2005). 

6. Norenzayan, A. Big Gods: How Religion Transformed Cooperation and Conflict 
(Princeton Univ. Press, 2013). 

7. Norenzayan, A. et al. The cultural evolution of prosocial religions. Behav. 
Brain Sci. http://dx.doi.org/10.1017/S0140525X14001356 (2015). 

8. Schloss, J. P. & Murray, M. J. Evolutionary accounts of belief in supernatural 
punishment: a critical review. Relig. Brain Behav. 1, 46-99 (2011). 

9. McNamara, R. A., Norenzayan, A. & Henrich, J. Supernatural punishment, 
in-group biases, and material insecurity: experiments and ethnography from 
Yasawa, Fiji. Relig. Brain Behav. 6, 34-55 (2016). 

10. Rossano, M. J. Supernaturalizing social life. Hum. Nat. 18, 272-294 
(2007). 

11. Sosis, R. & Bressler, E. R. Cooperation and commune longevity: a test of the 
costly signaling theory of religion. Cross-Cultural Res. 37, 211-239 (2003). 

12. Sosis, R. & Ruffle, B. J. Religious ritual and cooperation: testing for a 
relationship on Israeli religious and secular kibbutzim. Curr. Anthropol. 44, 
713-722 (2003). 

13. Atran, S. & Henrich, J. The evolution of religion: how cognitive by-products, 
adaptive learning heuristics, ritual displays, and group competition generate 
deep commitments to prosocial religions. Biol. Theory 5, 18-30 (2010). 

14. Pew Research Center. The Future of World Religions: Population Growth 
Projections 2010-2050. http://www.pewforum.org/files/2015/03/ 
PF_15.04.02_ProjectionsFullReport.pdf (2015). 

15. Botero, C. A. et al. The ecology of religious beliefs. Proc. Natl Acad. Sci. USA 
111, 16784-16789 (2014). 

16. Watts, J. et al. Broad supernatural punishment but not moralizing high gods 
precede the evolution of political complexity in Austronesia. Proc. R. Soc. 
Lond. B 282, 20142556 (2015). 

17. Haidt, J. & Kesebir, S. in Handbook of Social Psychology 797-832 (Wiley, 2010). 

18. Purzycki, B. G. The minds of gods: a comparative study of supernatural 
agency. Cognition 129, 163-179 (2013). 

19. Purzycki, B. G. Tyvan cher eezi and the socioecological constraints of 
supernatural agents’ minds. Relig. Brain Behav. 1, 31-45 (2011). 

20. Purzycki, B. G. et al. What does God know? Supernatural agents’ access to 
socially strategic and non-strategic information. Cogn. Sci. 36, 846-869 
(2012). 




21. Atkinson, Q. D. & Bourrat, P. Beliefs about God, the afterlife and morality 
support the role of supernatural policing in human cooperation. Evol. Hum. 
Behav. 32, 41-49 (2011). 

22. Shariff, A. F. & Rhemtulla, M. Divergent effects of beliefs in heaven and hell on 
national crime rates. PLoS ONE 7, e39048 (2012). 

23. Bering, J. M., McLeod, K. & Shackelford, T. K. Reasoning about dead agents 
reveals possible adaptive trends. Hum. Nat. 16, 360-381 (2005). 

24. Piazza, J., Bering, J. M. & Ingram, G. ‘Princess Alice is watching you’: children’s 
belief in an invisible person inhibits cheating. J. Exp. Child Psychol. 109, 
311-320 (2011). 

25. Shariff, A. F., Willard, A. K., Andersen, T. & Norenzayan, A. Religious priming: 
a meta-analysis with a focus on prosociality. Personal. Soc. Psychol. Rev. 20, 
27-48 (2016). 

26. Henrich, J., Heine, S. J. & Norenzayan, A. The weirdest people in the world? 
Behav. Brain Sci. 33, 61-83 (2010). 

27. Henrich, J. et al. Markets, religion, community size, and the evolution of 
fairness and punishment. Science 327, 1480-1484 (2010). 

28. Cohn, A., Fehr, E. & Maréchal, M. A. Business culture and dishonesty in the 
banking industry. Nature 516, 86-89 (2014). 

29. Hruschka, D. et al. Impartial institutions, pathogen stress and the expanding 
social network. Hum. Nat. 25, 567-579 (2014). 

30. Chuah, S.-H., Hoffmann, R., Ramasamy, B. & Tan, J. H. W. Religion, ethnicity 
and cooperation: an experimental study. J. Econ. Psychol. 45, 33-43 (2014). 

31. Xygalatas, D. et al. Extreme rituals promote prosociality. Psychol. Sci. 24, 
1602-1605 (2013). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank everyone who participated in the study and 
our local assistants without whom this project would not have been possible. 
We acknowledge funding from a research grant, “The Emergence of Prosocial 
Religions” from the John Templeton Foundation, and the Cultural Evolution of 
Religion Research Consortium (CERC), funded by a generous partnership grant 
(895-2011-1009) from the Social Sciences and Humanities Research Council 
of Canada. Q.D.A. is grateful for the support provided by a Rutherford Discovery 
Fellowship, E.C. thanks the British Academy for Fellowship support, J.H. thanks 
the Canadian Institute for Advanced Research for support, A.N. thanks the 
James McKeen Cattell Foundation for sabbatical support, and B.G.P. thanks 
L. Loveridge and J. McCutcheon. We thank A. Baimel, A. Barnett, J. Bulbulia, 
N. Chan, M. Collard, T. Lai, J. Lanman, B. Milner, M. Muthukrishna, C. Placek, 
E. Slingerland, R. Sosis, H. Whitehouse and C. Xu. 


Author Contributions J.H., A.N. and B.G.P. conceived the study, prepared 
protocols, managed project communication and initiated manuscript 
preparation. C.A., Q.D.A., E.C., R.A.M., A.K.W., B.G.P. and D.X. collected data. 
B.G.P. conducted all analyses, made all graphs, Tables and illustrations. All 
authors participated in developing and refining protocols, experimental design 
and in manuscript preparation. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to 
B.G.P. (bgpurzycki@alumni.ubc.ca) or J.H. (henrich@psych.ubc.ca). 




Extended Data Figure 1 | Map of the eight field site locations. Map from R package ‘maps’ (2015). R version by Ray Brownrigg. R package version 3.0.0-2 (http://CRAN.R-project.org/package=maps). Site labels on the map: Kyzyl, Tyva Republic; Hadzaland, Tanzania; Pesqueiro, Brazil; Lovu, Fiji; Pointe aux Piments, Mauritius; Inland Tanna; Coastal Tanna. 

Extended Data Table 1 | Models accounting for allocations to anonymous distant co-religionists with punishment-knowledge aggregate 

All models are binomial logistic regressions, backward selected for site inclusion. Models include field sites as fixed effects. Moralistic god variables are denoted as ‘MG’ and local god variables are 
denoted as ‘LG’. Pseudo R²s are Nagelkerke’s R². ***P<0.001, **P<0.01, *P<0.05, +P<0.10, P<0.15. See Supplementary Information section S2.3 for variable definitions, Supplementary Information 
section S5 for discussion of analyses and Supplementary Table S9 for punishment-knowledge aggregate models with extreme values removed. 


The genome of the seagrass Zostera marina reveals 
angiosperm adaptation to the sea 


Jeanine L. Olsen, Pierre Rouzé, Bram Verhelst, Yao-Cheng Lin, Till Bayer, Jonas Collen, Emanuela Dattolo, 
Emanuele De Paoli, Simon Dittami, Florian Maumus, Gurvan Michel, Anna Kersting, Chiara Lauritano, Rolf Lohaus, 
Mats Töpel, Thierry Tonon, Kevin Vanneste, Mojgan Amirebrahimi, Janina Brakel, Christoffer Boström, 
Mansi Chovatia, Jane Grimwood, Jerry W. Jenkins, Alexander Jueterbock, Amy Mraz, Wytze T. Stam, Hope Tice, 
Erich Bornberg-Bauer, Pamela J. Green, Gareth A. Pearson, Gabriele Procaccini, Carlos M. Duarte, Jeremy Schmutz, 
Thorsten B. H. Reusch & Yves Van de Peer 


Seagrasses colonized the sea¹ on at least three independent occasions 
to form the basis of one of the most productive and widespread 
coastal ecosystems on the planet². Here we report the genome of 
Zostera marina (L.), the first, to our knowledge, marine angiosperm 
to be fully sequenced. This reveals unique insights into the 
genomic losses and gains involved in achieving the structural and 
physiological adaptations required for its marine lifestyle, arguably 
the most severe habitat shift ever accomplished by flowering 
plants. Key angiosperm innovations that were lost include the 
entire repertoire of stomatal genes³, genes involved in the synthesis 
of terpenoids and ethylene signalling, and genes for ultraviolet 
protection and phytochromes for far-red sensing. Seagrasses have 
also regained functions enabling them to adjust to full salinity. Their 
cell walls contain all of the polysaccharides typical of land plants, 
but also contain polyanionic, low-methylated pectins and sulfated 
galactans, a feature shared with the cell walls of all macroalgae⁴ 
and that is important for ion homoeostasis, nutrient uptake and 
O₂/CO₂ exchange through leaf epidermal cells. The Z. marina 
genome resource will markedly advance a wide range of functional 
ecological studies from adaptation of marine ecosystems 
under climate warming⁵,⁶, to unravelling the mechanisms of 
osmoregulation under high salinities that may further inform our 
understanding of the evolution of salt tolerance in crop plants⁷. 

Seagrasses are a polyphyletic assemblage of basal monocots belong- 
ing to four families in the Alismatales’” (Supplementary Note 1.1 
and Supplementary Fig. 1.1). As a functional group, they provide the 
foundation of highly productive ecosystems present along the coasts 
of all continents except Antarctica, where they rival tropical rain 
forests and coral reefs in ecosystem services*”. In colonizing sedimen- 
tary shorelines of the world’s ocean, seagrasses found a vast new habitat 
free of terrestrial competitors and insect pests but had to adapt to cope 
with new structural and physiological challenges related to full marine 
conditions. 

Zostera marina (Zosteraceae), or eelgrass (Fig. 1), is the most wide- 
spread species throughout the temperate northern hemisphere of 


the Pacific and Atlantic’®. A clone of Z. marina was sequenced from 
the Archipelago Sea, southwest Finland, using a combination of fos- 
mid-ends and whole-genome shotgun (WGS) approaches (Methods, 
Supplementary Note 2). The 202.3 Mb Z. marina genome encodes 20,450 
protein-coding genes, 86.6% of which (17,511 genes, Supplementary 
Note 3.1) are supported by transcriptome data from leaves, roots and 
flowers (Extended Data Fig. 1, Supplementary Notes 3.2-3.3 and 
Supplementary Data 1-3). Genes are located in numerous gene-dense 
islands separated by stretches of repeat elements accounting for 63% of 
the non-gapped assembly (Extended Data Fig. 2, Supplementary Note 
3.1) as compared to only 13% in the only other sequenced alismatid, 
the freshwater duckweed, Spirodela polyrhiza (Alismatales, Araceae)¹¹. 
Gypsy-type (32%) and Copia-type (20%) transposable elements contrib- 
ute to most of the repetitive DNA. Sequence divergence analysis suggests 
that the genome retains copies from two distinct periods of invasion by 
Copia elements, but only one period for Gypsy elements (Extended Data 
Fig. 3a—c). Genes gained by Z. marina (‘accessory’) are located closer to 
transposable elements than to conserved (‘single copy’) genes (Fisher's 
exact test, P<0.0001) indicating that transposable elements may have 
played a role in genic adaptation. 
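
The comparison above relies on Fisher's exact test for a 2 × 2 table. As a minimal sketch of how such a test works, the Python below computes a two-sided P value from the hypergeometric distribution. The counts, the function name and the "near a transposable element" split are hypothetical illustrations, not values or definitions from this paper.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts (NOT from the paper): rows are accessory / single-copy
# genes, columns are "near a transposable element" / "not near".
p_value = fisher_exact_two_sided(120, 80, 60, 140)
print(f"two-sided P = {p_value:.3g}")
```

With real data, the four cells would be gene counts obtained after choosing a distance threshold for "near", which is a choice this sketch leaves open.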

We identified 36 conserved microRNAs with high confidence and 
their predicted targets (Supplementary Note 3.4, Supplementary Data 4 
and 5). A novel variant of miR528 (not present in Spirodela) was found 
to be the only member of this miRNA family, and demonstrates that 
this conserved miRNA is the only one ancestral to the entire monocot 
lineage. Most likely, Z. marina did not take part in the subsequent birth 
of miRNAs that are common to several other monocots¹²; nor did it 
experience or retain traces of prominent miRNA duplications. 

Analysis of synonymous substitutions per synonymous site (Ks) age 
distributions indicates that Z. marina carries the remnants of an inde- 
pendent, ancient whole-genome duplication (WGD) event (Fig. 2a, 
Supplementary Note 4.1)¹³. Duplicated segments account for ~9% 
of the Z. marina genome, probably an underestimate due to the 
fragmented nature of the assembly. Zostera and Spirodela diverged 
somewhere between 135 and 107 million years ago (Mya)¹⁴ and 
Figure 1 | Zostera marina and phylogenetic tree showing gene family 
expansion/contraction analysis compared with 13 representatives of the 
Viridiplantae. a, Gains and losses are indicated along branches and nodes. 
The number of gene families, orphans (single-copy gene families) and 
number of predicted genes is indicated next to each species. Background 
colours (top to bottom) are Alismatales, other monocots, dicots, mosses/ 
algae. b, Typical Zostera marina meadow, Archipelago Sea, southwest 
Finland (photo by C.B.). 


phylogenomic dating¹³ of the Z. marina WGD suggests that it occurred 
72-64 Mya (Fig. 2b), thus independently from the two WGDs reported 
for S. polyrhiza¹¹. This timeframe coincides with the initial diversi- 
fication of a freshwater clade that includes three of the four families 
of seagrasses (Supplementary Table 1.1) and with the Cretaceous- 
Palaeogene (K-Pg) extinction event (Fig. 2c), which provided new 
ecological opportunities and may have triggered seagrass adaptive 
radiations. 

We mapped signatures of loss and gain of gene families 
(Supplementary Note 4.2) onto a phylogenetic tree (Fig. 1a). We also 
mapped losses and gains of Pfam domains (Supplementary Fig. 4.4, 


Supplementary Data 6). While many genes are shared between Zostera 
and Spirodela, clearly some losses and gains are unique to Zostera in 
relation to its marine environment, the alismatid lineage having set the 
stage for the subsequent freshwater-marine transition. Those unique 
to Z. marina include the absence of all the genes involved in stomatal 
differentiation (Fig. 3a, Extended Data Table 1 and Supplementary 
Note 5.1) and the disappearance of genes comprising entire path- 
ways encoding volatiles synthesis and sensing (Supplementary Note 
6.1), such as those for ethylene! (Fig. 3b, Extended Data Table 2). 
Terpenoid genes are also drastically reduced to two (Fig. 3c), as com- 
pared with four in Spirodela, 50 in Oryza and > 100 in Eucalyptus, thus 


Figure 2 | Ancient whole-genome duplication (WGD). a, Ks-based 
age distribution of the whole Z. marina paranome. The x axis shows the 
synonymous distance until a Ks cut-off of 2, in bins of 0.04, containing 
the Ks values that were used for mixture modelling (excluding those with 
a Ks < 0.1). The component of the Gaussian mixture model plotted in red 
(as identified by EMMIX) corresponds to a WGD feature based on the 
SiZer analysis (other components are shown in black). The transition from 
the blue to the red at a Ks of ~0.8 in the SiZer panel (below) indicates a 
change in the distribution and therefore provides evidence for an ancient 
WGD (Supplementary Table 4.1, Supplementary Fig. 4.1). b, Absolute age 
distribution obtained by phylogenomic dating of Z. marina paralogues. 
The solid black line represents the kernel density estimate (KDE) of the 
dated paralogues and the vertical dashed black line represents its peak, 
used as the consensus WGD age estimate, at 67 Mya. Grey lines represent 
the density estimates from 2,500 bootstrap replicates and the vertical black 
dotted lines represent the corresponding 90% confidence interval for the 
WGD age estimate, 64-72 Mya. The original raw distribution of dated 
paralogues is indicated by the circles. The y axis represents the percentage 
of gene pairs. c, Pruned phylogenetic tree with indication of WGD events 
(boxes)*’. The Cretaceous-Palaeogene (K-Pg) boundary is indicated by 
an arrow. 
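
The dating procedure in panel b (a kernel density estimate whose peak gives the consensus age, with bootstrap replicates giving a 90% interval) can be illustrated with a minimal Python sketch. The ages are simulated, the bandwidth rule (Silverman's rule of thumb), the grid and the replicate count are assumptions of this sketch, and the function names are hypothetical; none of this reproduces the pipeline used in the paper.

```python
import math
import random
import statistics


def kde_peak(ages, grid, bandwidth):
    """Return the grid point where a Gaussian kernel density estimate is highest."""
    def density(x):
        return sum(math.exp(-0.5 * ((x - a) / bandwidth) ** 2) for a in ages)
    return max(grid, key=density)


def bootstrap_peak_interval(ages, n_boot=100, level=0.90, seed=0):
    """KDE peak of the sample plus a percentile bootstrap interval for that peak."""
    rng = random.Random(seed)
    # Silverman's rule of thumb; held fixed across replicates for simplicity.
    bandwidth = 1.06 * statistics.stdev(ages) * len(ages) ** (-1 / 5)
    low, high = min(ages), max(ages)
    grid = [low + i * (high - low) / 100 for i in range(101)]
    peak = kde_peak(ages, grid, bandwidth)
    # Resample the ages with replacement and record the peak of each replicate.
    boot_peaks = sorted(
        kde_peak([rng.choice(ages) for _ in ages], grid, bandwidth)
        for _ in range(n_boot)
    )
    tail = (1 - level) / 2
    lower = boot_peaks[int(tail * (n_boot - 1))]
    upper = boot_peaks[int((1 - tail) * (n_boot - 1))]
    return peak, (lower, upper)


# Simulated paralogue ages in Mya (NOT the paper's data).
sim_rng = random.Random(42)
simulated_ages = [sim_rng.gauss(67, 8) for _ in range(100)]
peak, (lower, upper) = bootstrap_peak_interval(simulated_ages)
print(f"peak ~{peak:.1f} Mya, 90% interval {lower:.1f}-{upper:.1f} Mya")
```

The caption reports 2,500 bootstrap replicates; the sketch uses far fewer only to keep a pure-Python run short.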
Figure 3 | Reconstruction of metabolic (or gene) pathways involved in 
the production of stomata, ethylene, terpene and pollen in Z. marina. 
a, Stomata differentiation from meristemoid mother cells (MMC) to 
guard mother cell (GMC) to guard cells. b, Ethylene synthesis and 
signalling up to EIN2 have disappeared; EIN3 and its downstream targets 
remain. c, Terpenoid biosynthesis in which the pathways producing 
volatiles are absent but those essential for primary metabolism remain. 
MVA, mevalonate; MEP, plastidic methylerythritol phosphate; IPP, 
isopentenyl pyrophosphate; DMAPP, dimethylallyl pyrophosphate; FPP, 
farnesyl pyrophosphate; GPP, geranyl diphosphate; GGPP, geranylgeranyl 
pyrophosphate; CPP, copalyl pyrophosphate; GA, gibberellic acid; PP, 
diphosphate; ABA, abscisic acid. d, Sporopollenin biosynthesis genes; 
regulatory genes in the nucleus control downstream processes (arrows) in 
response to signalling coming from external stimuli through receptors on 
the plasma membrane. All panels: genes in red are absent; blue are present; 
the grey line represents the plasma membrane. See Extended Data Tables 1-3. 


precluding synthesis of secondary volatile terpenes (Supplementary 
Fig. 6.2). Only aromatic acid decarboxylases (AAAD) genes were 
expanded (Supplementary Fig. 6.3) and these form a clade distinct 
from Spirodela. The loss of volatiles is also consistent with the loss of 
stomata, through which they are emitted for airborne communication 
and plant defence. The repertoire of defence-related genes such as the 
six groups of NBS_LRR resistance genes (Supplementary Note 6.2) 
is also reduced to 44 (89 in Spirodela and 100-300 in other plants), 
which may be linked to a lower probability of infection of Z. marina 
due to the absence of stomata, which are a main entry point for pests 
and pathogens in terrestrial plants. 

Land and aquatic floating plants (Embryophyta) are often 
exposed to intense ultraviolet (UV) radiation and have devel- 
oped light sensing protein receptors with protective and signal- 
ling functions. In contrast, Z. marina inhabits a light-attenuated, 
submarine environment where it must cope with shifted spectral 
composition, characterized by low penetration of UV-B, red and 
far-red wavelengths¹⁶. Accordingly, Z. marina has lost ultraviolet- 
resistance (UVR8) genes associated with sensing and responding 
to UV damage (Spirodela has not), as well as phytochromes associ- 
ated with red/far-red receptors (Supplementary Note 7). Whereas 
photosystems (PSI and PSII) are similar to those of other plants 
including Spirodela, members of the light-harvesting complex B 
(LHCB) family are expanded in number, possibly in combination 
with non-photochemical quenching (NPQ), thereby enhancing per- 
formance at low light (Extended Data Fig. 4). 

Seagrasses typically experience full marine seawater (35 g kg⁻¹)¹⁷, 
whereas land plants obtain water with low osmolality (0-2 g kg⁻¹) via 
the rhizosphere and aquatic plants experience fresh (0-5 g kg⁻¹) to 
brackish (0.5-20 g kg⁻¹) conditions. Although Z. marina displays 
a typical repertoire of Na⁺ and K⁺ antiporters (Supplementary 
Note 8, Supplementary Table 8.1), one of six H⁺-ATPase (AHA) 
genes (Supplementary Table 8.2, Supplementary Data 7) is strongly 
expressed in vegetative tissue and encodes a salt-tolerant H⁺- 
ATPase. Furthermore, Z. marina possesses three AHA genes (along 
with Spirodela) in a cluster unique to alismatids (Supplementary 
Fig. 8.1). 

Uniquely, Z. marina has re-evolved new combinations of structural 
traits related to the cell wall. Synthesis of cutin-cuticular waxes to the 
outside of the leaf epidermis and suberin-lignin near the plasma mem- 
brane (Supplementary Note 9, Supplementary Table 9.1) surround a cell 
wall matrix of (hemi)celluloses, low-methylated pectin (zosterin) and 
macroalgal-like sulfated polysaccharides¹⁸ (Supplementary Note 10). 
The reduction in carbohydrate-related genes that modify the fine struc- 
ture of cell wall hemicelluloses and pectins in Z. marina is not due to 
loss of pathways, but rather to the large variation within these CAZyme 
gene families in plants. Available genomes (including Spirodela) lack 
carbohydrate sulfotransferases and sulfatases, suggesting that land 
plants have lost these genes as a key adaptation to terrestrial as well 
as freshwater conditions¹⁹,²⁰. In contrast, Z. marina has regained the 
ability to produce sulfated polysaccharides with an expansion of aryl 
sulfotransferases (12 genes) homologous to aryl sulfotransferases from 
land plants (Supplementary Note 10). Sulfation facilitates water and ion 
retention in the cell wall to cope with desiccation and osmotic stress 
at low tide and, likewise, low methylation of zosterin correlates with 
the expanded pectin carbohydrate esterase 8 (CE8) family, increas- 
ing the polyanionic character of the cell wall matrix. We speculate 
that several aryl sulfotransferases have evolved because carbohydrate 
sulfatases have been shown to be active on artificial aryl compounds 
such as methylumbelliferyl-sulfate²¹. Osmotic equilibrium is further 
achieved in Z. marina by organic osmolytes (mainly sucrose, 
trehalose and proline) in combination with a small cytoplasm:vacuole 
volume ratio (10%)²². Given that up to 90% of fixed carbon is stored as 
sucrose in the rhizomes, sucrose synthase (SuSy) and transport (SUT) 
genes are expanded while those for starch metabolism are greatly 
reduced, as expected in ‘marine sugarcane’ (Supplementary Note 7.2, 
Supplementary Data 8). 

The repertoire of redox and other stress-resistance genes 
(Supplementary Note 8) is typical for angiosperms with the exception 
of catalase (CAT), which is reduced to a single copy in Z. marina (two 
in Spirodela). Late embryogenesis abundant (LEA) and dehydrins are 
clearly under-represented in both Zostera and Spirodela relative to other 
genomes. In contrast, Zostera possesses an unusual complement of 
metallothioneins. Aside from their role as chelators, metallothioneins 
may be involved in stress resistance; one of these, MT2L, is among 
the most highly constitutively expressed genes in Z. marina (Extended 
Data Fig. 5, Supplementary Note 8.2). 

Sexual reproduction of Z. marina takes place underwater, involving 
completely submerged male and female flowers, and a unique exine-less, 
filiform pollen that winds around the bifurcate stigmas in a purely abiotic 
pollination process²³. Note that freshwater alismatids (and also 
Spirodela)²⁴ possess pollen with an exine layer. Exine-less pollen²⁵ 
is characteristic of all seagrasses except Enhalus acoroides (which is 
surface pollinated). Ten genes specifically involved in biosynthesis 
and modification of the pollen exine coat are missing; all other genes 
involved in the development of viable pollen remain intact (Fig. 3d, 
Extended Data Table 3, Supplementary Note 11.1). Finally, MADS-box 
gene transcription factors are also highly reduced to 50 in Z. marina, 
which is most likely related to its highly reduced flowers (also a fea- 
ture of Spirodela) that lack the first two whorls of specialized floral 
leaves, calyx and corolla (Supplementary Note 11.2, Supplementary 
Table 11.2). 

An increasing proportion of the world population inhabits the 
coastal zone. This imposes multiple pressures on ecosystems including 
seagrass beds²⁶,²⁷, which in turn compromises the ecosystem services 
they may provide, including provisioning of harvestable fish and inver- 
tebrates, nutrient retention, carbon sequestration and erosion control. 
In the context of seagrass conservation, elucidating the genomic basis of 
Z. marina’s complex adaptations to ocean waters (Extended Data Fig. 6) 
will also inform the development of molecular indicators of their 
physiological status”*, as these unique ecosystems rank, unfortunately, 
among the most threatened on Earth**””, 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment. 

Plant material and DNA preparation. A single genotype/clone of Zostera marina 
(referred to as the ‘Finnish clone’) was harvested on 26 August 2010 at 2 m depth at 
Fårö Island (latitude 59° 55.234′ N, longitude 21° 47.766′ E) located in the northern 
Baltic Sea, Finland. Plant material was transported to the lab in seawater, cleaned 
and further processed. Care was taken to use leaf-meristem tissue harvested from 
the inner layer of basal shoots to minimize bacterial/diatom contamination. Tissues 
were immediately frozen in LN2 and stored at —80°C for later DNA and RNA 
extraction. Monoclonality was verified by genotyping 40 ramets of the mega-clone 
with six highly polymorphic, microsatellite loci*’. There was no evidence for poly- 
ploidy”! (Z. marina is 2n= 12) or somatic mutations*? as assessed by multiple 
peaks in the microsatellite chromatograms. Tissue was subsequently sent on dry ice 
to Amplicon Express for HMW DNA extraction using a CTAB isolation method 
modified by R. Meilan (unpublished) but available from him (rmeilan@purdue. 
edu), based on the original method**. Following QC according to JGI guidelines, 
the DNA was shipped to JGI for library and sequencing preparation. 

Genome sequencing and assembly. One 35-Kb fosmid library was generated for 
end sequencing. The fosmid ends were sequenced with standard Sanger sequencing 
protocols at the HudsonAlpha Institute for a total of 194,303 Sanger reads 
(0.29× coverage). Illumina libraries (two fragment libraries (6.62 Gb), one 2-Kb 
JGI mate-pair library (3.57 Gb), one 4-Kb JGI mate-pair library (3.41 Gb) and two 
8-Kb JGI mate-pair libraries (11.94 Gb)) were sequenced with Illumina MiSeq/ 
HiSeq genetic analysers at the Department of Energy’s Joint Genome Institute 
(JGI), using standard protocols. A total of 25.55 Gb of Illumina and 0.14 Gb of 
Sanger sequence was obtained representing 47.7× genomic coverage. Prior to 
assembly, all reads were screened against mitochondria, chloroplast, and Illumina 
controls. Reads composed of >95% simple sequence repeats were removed. For 
the Illumina paired-end libraries (2 × 250), reads <75 bp were discarded; for the 
2 × 150 libraries, reads <50 bp were discarded after trimming for adaptor and 
quality (q < 20). An additional deduplication step was performed on the mate 
pairs that identified and retained only one copy of each PCR duplicate. A total of 
212,101,273 reads (Supplementary Table 2.1) was assembled using our modified 
version of Arachne v. 20071016 (ref. 35). Subsequent directed Arachne modules 
were applied to collapse adjacent heterozygous contigs. The entire assembly was 
then run through another Arachne process starting at Stage 6 Rebuilder. This 
produced 15,747 scaffold sequences (30,723 contigs), with a scaffold L50 of 
409.5 Kb, 613 scaffolds larger than 100 Kb, and a total genome size of 237.5 Mb 
(Supplementary Table 2.2). 
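
The scaffold L50 quoted above is a contiguity statistic. As an illustrative sketch, the Python below assumes the convention in which the reported value is the length of the scaffold at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly (many tools call this N50). The function name and the toy lengths are hypothetical and unrelated to the Z. marina assembly.

```python
def half_assembly_length(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequences given")


# Made-up scaffold lengths in Kb, for illustration only.
toy_scaffolds_kb = [900, 600, 400, 300, 200, 100]
print(half_assembly_length(toy_scaffolds_kb))  # 600
```

In the toy example the total is 2,500 Kb, and the two longest scaffolds (900 + 600 = 1,500 Kb) already cover half of it, so the statistic is 600 Kb.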

Scaffolds were screened against bacterial proteins, organelle sequences, 
GenBank NR (nr_prot) and RefSeq protein databases, and removed if found to 
be a contaminant. Scaffolds consisting of prokaryotes, chloroplast, mitochondria 
and unanchored rDNA were removed. We also assembled the chloroplast and 
partial mitochondrial genomes (Supplementary Notes 2.2 and 2.3, Supplementary 
Fig. 2.1). Additionally, short (<1 Kb) scaffolds or scaffolds containing highly 
repetitive sequence (>95% 24-mers found more than four times in large 
scaffolds) or alternative haplotypes were also removed. Following repeat analysis and 
gene prediction, all scaffolds were subjected to a filtering process (based on NCBI 
nr_prot + NCBI taxonomy database) to eliminate remaining bacterial (and other) 
contaminants (Supplementary Table 2.3). 

Assembly validation was performed using a set of 12 fully sequenced fosmid 
clones. In 4 of the 12 fosmid clones, full-length alignments were not found due to 
fragmentation in the region of the fosmid clone. In five of the remaining eight fos- 
mid clones, the alignments were of high quality (<0.05% bp error). The overall base 
pair error rate (including marked gap bases) in the fosmid clones that aligned to 
full length was 0.28% (714 discrepant base pairs out of 253,332 bp). Supplementary 
Table 2.4 shows the individual fosmid clones and their contribution to the overall 
error rate. Note that two fosmid clones (16248, 16249) contributed nearly 81% of 
the discrepant bases. This probably occurred in polymorphic regions of the genome 
where the haplotype in the fosmid did not match the haplotype in the reference. 
There are several indels of various sizes in the clone and assembly, typical of a 
region of degraded transposons. Further quality analysis indicated that 90% of 
the set of eukaryotic core genes (CEGMA) were present and 98% were partially 
represented, suggesting near completeness of the euchromatin component. 
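
The overall error rate quoted above follows directly from the two counts given in the text. The short Python check below reproduces it; the function name is hypothetical, and the figure for the two outlying fosmid clones is only an approximation because the text gives their share as "nearly 81%".

```python
def error_rate_percent(discrepant_bp, aligned_bp):
    """Base-pair error rate as a percentage of the aligned length."""
    return 100 * discrepant_bp / aligned_bp


discrepant_bp = 714      # discrepant base pairs reported in the text
aligned_bp = 253_332     # total aligned fosmid length reported in the text

rate = error_rate_percent(discrepant_bp, aligned_bp)
print(f"{rate:.2f}%")  # 0.28%

# Rough number of discrepant bases attributable to clones 16248 and 16249,
# taking their share as 81% (the text says "nearly 81%").
approx_from_two_clones = round(0.81 * discrepant_bp)
print(approx_from_two_clones)  # 578
```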
Annotation of repetitive sequences. Two complementary approaches were used 
to identify repetitive DNA sequences in the Z. marina genome. With respect to 
masking repeats before gene prediction analysis, a de novo repeat identification 
was carried out with RepeatModeller (v. open-1.0.7; http://www.RepeatMasker. 
org)*° to identify repeat boundaries and build consensus models from which 
potential over represented, non-transposable element, protein-coding genes were 
removed. RepeatMasker (v. open-4.0.0, WUBlast) was used in combination with 


this custom repeat library to mask the assembly and prepare it for gene prediction 
with EuGene. 

Furthermore, in order to perform a qualitative and quantitative analysis of repeats 
with greater resolution” the genome assembly was processed for de novo repeat 
detection using the TEdenovo pipeline from the REPET package v. 2.2 (ref. 38); 
parameters were set to consider repeats with at least five copies. The consensus 
sequences generated by TEdenovo were then used as probes for whole genome 
annotation by the TEannot* pipeline from the REPET package v. 2.2. The con- 
sensus repeat sequences were classified using Pastec*”. Comparing the genomic 
positions of transposable elements (TE) to those of exons from the set of predicted 
genes enabled us to identify that 909 gene predictions most likely represent TEs 
and these were filtered from the gene set. The REPET package v. 2.2 was also used 
to annotate repetitive elements in the Spirodela polyrhiza genome assembly with 
the same parameters as for Z. marina. See Supplementary Fig. 3.1. 
Transcriptome library preparation, sequencing and assembly. Leaf, root and 
flower tissues were separately frozen in liquid nitrogen immediately follow- 
ing harvest from either ambient (field collected) or experimental (mesocosm) 
conditions (Supplementary Note 3.2). Overall, we obtained between nine and 
20 million high-quality reads from each of the flower-leaf-root replicate libraries; 
and for the Finnish clone library, 148.5 million high quality reads were retrieved 
(Supplementary Table 3.3). 

The de novo assembly protocol was adapted from ref. 41. We pooled replicates 
of each tissue together, except for the two leaf tissue libraries, which were kept 
separate (Supplementary Table 3.4), and performed de novo transcriptome assembly 
for each tissue using Trinity (v. 2014-07-17) with the digital normalization option 
switched on to normalize input read coverage. Frame-shift errors and insertion/deletion 
errors in the assembled transcripts were corrected by FrameDP. Because a 
de novo assembly still generates many spurious transcripts, we used transcript 
expression values to remove low-quality contigs. We used the RSEM pipeline to 
obtain the contig expression values and removed contigs with FPKM (fragments 
per kilobase of transcript per million fragments mapped) value < 1 and IsoPct 
(percentage of expression for a given transcript compared with all expression from 
that Trinity component) < 1. In total, we obtained between 39,000 and 53,000 
assembled contigs from each library, and 52,000 contigs from the Finnish clone 
library (Supplementary Table 3.4). Before mapping to the genome sequence and the 
predicted genes, we used the CD-HIT program (v. 4.6.1) to collapse redundant 
contigs, which resulted in 79,134 low-redundancy transcript contigs. 
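
The expression-based filter described above can be illustrated with a minimal Python sketch. It is not the authors' script: the `Contig` record and the `filter_contigs` helper are hypothetical names, and the sketch assumes the literal reading of the text, in which a contig is removed only when both FPKM and IsoPct fall below 1.

```python
from typing import Iterable, List, NamedTuple


class Contig(NamedTuple):
    """Hypothetical record holding RSEM-style values for one assembled contig."""
    name: str
    fpkm: float     # expression level of the contig
    iso_pct: float  # share (%) of its Trinity component's expression


def filter_contigs(
    contigs: Iterable[Contig],
    min_fpkm: float = 1.0,
    min_iso_pct: float = 1.0,
) -> List[Contig]:
    """Keep contigs unless both FPKM and IsoPct fall below their thresholds."""
    kept = []
    for contig in contigs:
        low_expression = contig.fpkm < min_fpkm
        minor_isoform = contig.iso_pct < min_iso_pct
        # Assumption: literal reading of the text, remove only if both hold.
        if low_expression and minor_isoform:
            continue
        kept.append(contig)
    return kept


example = [
    Contig("contig_a", fpkm=5.0, iso_pct=80.0),
    Contig("contig_b", fpkm=0.4, iso_pct=0.5),
    Contig("contig_c", fpkm=0.4, iso_pct=50.0),
]
kept_names = [c.name for c in filter_contigs(example)]
```

A stricter filter that drops a contig when either value is below its threshold would replace `and` with `or` in the removal test.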

Differential gene expression analysis. High-quality RNA-seq reads were mapped 
to the genome assembly v.2.1 by TopHat. Differential gene expression analysis 
was performed with the Cufflinks pipeline based on the Z. marina v.2.1 gene models 
by converting the number of aligned reads into FPKM values. Genes with a significant 
expression difference (log2 fold change > 2) were selected for further investigation 
with GOstats to perform Gene Ontology (GO) term enrichment analysis with P < 0.05 
(Supplementary Note 3.3, Supplementary Table 3.5). 

MicroRNA analysis. Genomic precursors of known miRNAs were mapped on 
the Z. marina genome following the procedure described in ref. 47 for the maize 
genome. miRNA entries from the miRBase database (release 21, 2014) were 
aligned to the chromosomes of the Z. marina genome. Up to three mismatches 
were allowed in the alignment, using SeqMap. In parallel, novel potential 
DCL1/AGO1-dependent miRNAs were enriched by selecting 5′-U 20-22-nt small RNAs 
from three different sequenced libraries from Z. marina described in ref. 12. A 
subset of these small RNAs with abundance > 10 TPM (transcripts per million) 
was retained and aligned to the genome with no mismatches. From every locus, 
we extracted two ~200-nt regions surrounding each aligned miRNA or candidate 
(from −30 to +160 and from −160 to +30 nucleotides relative to the putative 
miRNA start or end coordinate, respectively). Minimum energy RNA secondary 
structures were predicted for each region using the RNAfold program of the Vienna 
RNA 1.8.5 package (http://www.tbi.univie.ac.at/~ivo/RNA/) using default settings. 
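
The two flanking windows described above can be sketched as follows. This is an illustrative helper only: the function name `precursor_windows`, the 0-based half-open coordinates and the clipping to the scaffold ends are assumptions, and strand handling is omitted.

```python
from typing import Tuple

Window = Tuple[int, int]


def precursor_windows(
    mirna_start: int, mirna_end: int, seq_len: int
) -> Tuple[Window, Window]:
    """Return the two candidate precursor regions around an aligned miRNA.

    The first window runs from -30 to +160 relative to the miRNA start,
    the second from -160 to +30 relative to the miRNA end. Both are
    clipped so that they stay inside the scaffold (assumed behaviour).
    """
    def clip(lo: int, hi: int) -> Window:
        return max(0, lo), min(seq_len, hi)

    start_anchored = clip(mirna_start - 30, mirna_start + 160)
    end_anchored = clip(mirna_end - 160, mirna_end + 30)
    return start_anchored, end_anchored


windows = precursor_windows(1000, 1021, 5000)
```

Each window would then be folded separately, so that a miRNA lying on either arm of a hairpin can be captured.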

In addition, small RNAs from the three sequenced libraries were mapped on 
these regions, allowing no mismatches, in order to pre-select putative miRNA 
loci that showed evidence of expression in the three plant tissues analysed. We 
evaluated RNA structure and small RNA alignment in all the regions based on: 
(1) dominance of plus-stranded small RNAs; (2) position of the most abundant 
small RNAs relative to the predicted miRNA coordinates; (3) prevalence of 20-22 nt 
small RNAs in the predicted miRNA locus; (4) position of the putative miRNA within 
the stem-loop structure; and (5) absence of oversize (>3 nt) bulges in the 
miRNA/miRNA* alignment. After reduction of overlapping loci to a non-redundant set 
and removal of stem-loop structures with the wrong orientation compared to 
miRNAs registered in miRBase, we manually inspected the remaining loci to 
further evaluate them according to the miRNA annotation criteria proposed by 
ref. 49. Stringency was relaxed when small RNA expression data strongly indicated 
the presence of miRNA loci that did not meet the whole set of criteria. Novel miRNA 
precursors overlapping with TEs or other repetitive elements were filtered out. 
Potential miRNA targets were identified in silico using the generic small 
RNA-transcriptome aligner GSTAr from the CleaveLand package (v. 4). Predicted 
targets were accepted with an Allen score < 4 or an MFE (minimum free energy) 
ratio > 7.5 (Supplementary Note 3.4). 

Gene prediction. Training of the gene prediction programs started with the col- 
lection of high quality ESTs. EST information was used, for example, to train the 
splice predictor SpliceMachine. Detection of conserved splice sites was further 
investigated using RNA-seq splice junctions (count > 10) to construct a WAM model 
in EuGene (v. 4.1). Coding potential was modelled with an interpolated Markov 
model (IMM) constructed from the BLASTX alignments of proteins from the 
PLAZA v. 2.5 database. An additional protein ‘monocot’ Markov model was built 
based on the protein sequences from Brachypodium, maize and sorghum. Starting 
from EST and protein alignments, a set of 215 gene models was manually 
constructed and curated using the genome browser GenomeView. The 215 models 
were then used as a training set for EuGene in order to optimize the different splice 
site and coding-potential models, as well as the weights for the extrinsic EST and 
homology evidence. An overall fitness score of 80.1% was achieved, which is high 
enough to obtain reliable results without overfitting. GeneMark and Augustus 
were separately trained (using the same input data as EuGene) and their 
predictions were integrated with EuGene using a custom script to evaluate the best gene 
structure at each locus. All gene models were automatically screened to highlight 
possible erroneous structures (for example, in-frame stop codons, deviating splice 
junctions) and manually curated. Transfer-RNA gene models were predicted by 
tRNAscan-SE (v. 1.31) and their structures were verified with Infernal (v. 1.1rc1, 
Rfam 11 covariance model database). For each gene, UTRs were assigned by 
identifying a set of ESTs and RNA-seq assemblies that uniquely overlapped with it. We 
subsequently selected the longest mapped transcript on either end of the predicted 
coding sequence and designated the section outside the coding sequence as the 
UTR. Finally, all genes were uploaded to the ORCAE platform 
(http://bioinformatics.psb.ugent.be/orcae), enabling all members of the consortium to refine 
and curate the gene models and assign gene function. A list of protein domains, 
as well as the derived Gene Ontology (GO) terms and KEGG pathway identifiers, 
was generated using an InterProScan (v. 5.2.45) analysis and is available in 
ORCAE. More specifically, gene functional descriptions were added either 
manually by consortium expert scientists or automatically through sequence 
homology searches. The automated method uses the EC (Enzyme Commission) 
number reported by InterProScan to retrieve the enzyme name, together with a BLASTP 
search against UniProtKB/Swiss-Prot in which hits below 60% identity 
or below 70% query/hit coverage are filtered out. Although such high stringency on per cent identity 
and sequence coverage reduced the available number of functional descriptions, 
it reduced the false-positive prediction rate, as desired here. 
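
As a rough illustration of the stringency filter on BLASTP hits, the following Python sketch applies the two thresholds named above. The `BlastHit` fields and the `passes_stringency` helper are hypothetical, and the sketch assumes that the 70% coverage threshold must be met on both the query and the hit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlastHit:
    """Hypothetical summary of one BLASTP alignment."""
    pct_identity: float  # percentage of identical aligned positions
    query_len: int       # full length of the query protein
    hit_len: int         # full length of the database protein
    query_aligned: int   # number of query residues in the alignment
    hit_aligned: int     # number of hit residues in the alignment


def passes_stringency(
    hit: BlastHit, min_identity: float = 60.0, min_coverage: float = 70.0
) -> bool:
    """True if the hit meets the identity and query/hit coverage thresholds."""
    query_cov = 100.0 * hit.query_aligned / hit.query_len
    hit_cov = 100.0 * hit.hit_aligned / hit.hit_len
    # Assumption: coverage is required on both sequences of the pair.
    return (
        hit.pct_identity >= min_identity
        and query_cov >= min_coverage
        and hit_cov >= min_coverage
    )


good = BlastHit(75.0, 300, 320, 280, 280)
weak_identity = BlastHit(55.0, 300, 320, 280, 280)
short_alignment = BlastHit(90.0, 1000, 320, 300, 300)
```

Requiring coverage on both sequences guards against transferring a description from a short domain-only match to a much longer protein.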

Construction of age distributions and WGD analyses. Ks-based age distributions 
were constructed as previously described. In brief, the Ks values between genes were 
obtained through maximum likelihood estimation using the CODEML program 
of the PAML package (v. 4.4c). Gene families were subdivided into subfamilies for 
which Ks estimates between members did not exceed a value of 5. For each 
duplication node in the resulting phylogenetic gene tree, obtained by PhyML, all m 
Ks estimates between the two child clades were added to the Ks distribution with a 
weight 1/m (where m is the number of Ks estimates for a duplication event), so that 
the weights of all Ks estimates for a single duplication event summed to one. Mixture 
modelling was used to confirm a WGD signature in the Ks distribution (Fig. 2 and 
Supplementary Fig. 4.1), for which all duplicates with Ks values < 0.1 were excluded 
to avoid the incorporation of allelic and/or splice variants, while all duplicates with 
Ks values > 2.0 were removed because Ks saturation and stochasticity can mislead 
mixture modelling above this range. For further details see Supplementary Note 4.1. 
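
The 1/m weighting scheme can be made concrete with a short Python sketch. The function name `weighted_ks` and the input layout (one list of Ks estimates per duplication event) are assumptions for illustration; the sketch also assumes that weights are computed from all m estimates of an event before the 0.1 to 2.0 range filter is applied.

```python
from typing import List, Sequence, Tuple


def weighted_ks(
    events: Sequence[Sequence[float]],
    ks_min: float = 0.1,
    ks_max: float = 2.0,
) -> List[Tuple[float, float]]:
    """Return (Ks, weight) pairs, each event contributing a total weight of one.

    Every duplication event supplies m Ks estimates, each weighted 1/m.
    Estimates outside [ks_min, ks_max] are dropped afterwards (assumed order).
    """
    weighted = []
    for ks_values in events:
        m = len(ks_values)
        for ks in ks_values:
            if ks_min <= ks <= ks_max:
                weighted.append((ks, 1.0 / m))
    return weighted


# Three hypothetical duplication events with 1, 2 and 4 Ks estimates.
events = [[0.9], [1.0, 1.2], [0.05, 0.5, 3.0, 1.5]]
pairs = weighted_ks(events)
total_weight = sum(weight for _, weight in pairs)
```

Here the third event loses two of its four estimates to the range filter, so it contributes a weight of 0.5 rather than 1 to the filtered distribution.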

Absolute dating of the identified WGD event was performed as described 
previously. In brief, paralogous gene pairs located in duplicated segments 
(anchors) and duplicated pairs lying under the WGD peak (peak-based duplicates) 
were collected for phylogenetic dating. Anchors, assumed to correspond 
to the most recent WGD, were detected using i-ADHoRe 3.0 (refs 66,67). Only a 
small number of duplicated segments, and hence anchors, could be identified, most 
likely because of the fragmented assembly of Z. marina. However, the identified 
anchors did confirm the presence of a broad WGD peak between a Ks of 0.8 and 
1.6 (data not shown). For each WGD paralogous pair, an orthogroup was created 
that included the two paralogues plus several orthologues from other plant 
species as identified by InParanoid (v. 4.1) using a broad taxonomic sampling: one 
representative orthologue from the order Cucurbitales, two from the Rosales, two 
from the Fabales, two from the Malpighiales, two from the Brassicales, one from 
the Malvales, one from the Solanales, two from the Poales, one orthologue from 
Musa acuminata (Zingiberales), and one orthologue from Spirodela polyrhiza 
(Alismatales). In total, about 180 orthogroups from anchor pair duplicates and 
peak-based duplicates were collected. The node joining the two Z. marina WGD 
paralogues was then dated using the BEAST v. 1.7 package under an uncorrelated 
relaxed clock model and an LG+G (four rate categories) evolutionary model. A 
starting tree with branch lengths satisfying all fossil prior constraints was created 
according to the consensus APGIII phylogeny. Fossil calibrations were 
implemented using log-normal calibration priors on the following nodes: the node 
uniting the Malvidae, based on the fossil Dressiantha bicarpellata, with prior 
offset = 82.8, mean = 3.8528 and s.d. = 0.5 (ref. 73); the node uniting the Fabidae, 
based on the fossil Paleoclusia chevalieri, with prior offset = 82.8, mean = 3.9314 
and s.d. = 0.5 (ref. 75); the node uniting the Alismatales (including Z. marina and 
Spirodela polyrhiza) with the other monocots, based on the oldest fossil 
monocot pollen, Liliacidites, from the Trent’s Reach locality, with prior offset = 125, 
mean = 2.0418 and s.d. = 0.5 (refs 14,78); and the root, with prior offset = 124, 
mean = 4.0786 and s.d. = 0.5 (ref. 79). The offsets of these calibrations 
represent hard minimum boundaries, while their means represent locations for their 
respective peak mass probabilities in accordance with some of the most recent 
and taxonomically complete dating studies available for these specific clades. 
A run without data was performed to ensure proper placement of the marginal 
calibration prior distributions. The Markov chain Monte Carlo (MCMC) for 
each orthogroup was run for 10⁷ generations, sampling every 1,000 generations, 
resulting in a sample size of 10⁴. The resulting trace files of all orthogroups were 
evaluated manually using Tracer v. 1.5 with a burn-in of 1,000 samples to ensure 
proper convergence (minimum ESS for all statistics at least 200). In total, 169 
orthogroups were accepted and all age estimates for the node uniting the WGD 
paralogous pairs were then grouped into one absolute age distribution (Fig. 2; too 
few anchors were available to evaluate them separately from the peak-based 
duplicates), for which kernel density estimation (KDE) and a bootstrapping procedure 
were used to find the peak consensus WGD age estimate and its 90% confidence 
interval boundaries, respectively. 
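
The final step, locating the peak of the pooled age distribution by KDE and bracketing it by bootstrapping, can be sketched in plain Python. This is a simplified stand-in rather than the procedure used in the study: the Gaussian kernel, the rule-of-thumb bandwidth, the grid search, the percentile interval and the names `kde_peak` and `bootstrap_interval` are all assumptions, and the example ages are invented.

```python
import math
import random
from typing import List, Sequence, Tuple


def kde_peak(values: Sequence[float], grid_points: int = 512) -> float:
    """Location of the maximum of a Gaussian kernel density estimate."""
    n = len(values)
    lo, hi = min(values), max(values)
    if n < 2 or lo == hi:
        return lo
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    bandwidth = 1.06 * sd * n ** (-1 / 5)  # rule-of-thumb bandwidth (assumed)
    best_x, best_density = lo, -1.0
    for i in range(grid_points):
        x = lo + (hi - lo) * i / (grid_points - 1)
        density = sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        if density > best_density:
            best_x, best_density = x, density
    return best_x


def bootstrap_interval(
    values: Sequence[float], replicates: int = 200, level: float = 0.90, seed: int = 0
) -> Tuple[float, float]:
    """Percentile interval of the KDE peak over bootstrap resamples."""
    rng = random.Random(seed)
    peaks: List[float] = sorted(
        kde_peak([rng.choice(values) for _ in values]) for _ in range(replicates)
    )
    tail = (1.0 - level) / 2.0
    lower = peaks[int(tail * (replicates - 1))]
    upper = peaks[int((1.0 - tail) * (replicates - 1))]
    return lower, upper


# Invented node ages (Myr) standing in for the accepted orthogroup estimates.
ages = [60, 64, 66, 68, 70, 70, 72, 74, 76, 80]
peak = kde_peak(ages)
lower, upper = bootstrap_interval(ages, replicates=50)
```

Taking the mode of the smoothed distribution, rather than the mean of the node ages, makes the consensus estimate less sensitive to a long tail of old or young outlier orthogroups.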

Intra- and inter-genomic co-linearity was investigated (Supplementary Tables 
4.2 and 4.3) using MCScanX based on a BLASTP search of all genomic 
protein-coding genes with an E-value cut-off of e−10. Only one large duplicated segment 
was detected, most likely because of the fragmented assembly of Z. marina: 
only 27 scaffolds were larger than 1 Mb, accounting for only 23.4% of all 
protein-coding genes. We therefore additionally used i-ADHoRe (v. 3.0) to 
investigate genomic co-linearity by including all possible scaffolds. 
Gene family comparisons. Protein sets were collected for 14 species: Z. marina 
(ORCAE v. 2.1), Arabidopsis thaliana (TAIR10), Thellungiella parvula 
(http://thellungiella.org), Populus trichocarpa (Phytozome v. 9.0), Vitis vinifera (Phytozome 
v. 9.0), Amborella trichopoda (http://amborella.huck.psu.edu), Oryza sativa 
japonica (Phytozome v. 9.0), Zea mays (Phytozome v. 9.0), Brachypodium distach- 
yon (Phytozome v. 9.0), Spirodela polyrhiza (http://mocklerlab.org), Selaginella 
moellendorffii (Phytozome v. 9.0), Physcomitrella patens (Phytozome v. 9.0), 
Chlamydomonas reinhardtii (Phytozome v. 9.0), and Ostreococcus lucimarinus 
(ORCAE v. 6/3/2013). These species were selected in order to provide a phyloge- 
netic representation traversing green algae, basal plants, monocots, and dicots. 
Following an ‘all-versus-all’ TimeLogic Decypher Tera-BLASTP (Active Motif Inc.; 
e-value threshold 1 x e~°, max hits 500) comparison, OrthoMCL (v. 2.0; mcl 
inflation factor 3.0) was used to delineate gene families. Confidence in establishing 
gene losses in Zostera was enhanced by using a combination of reciprocal blast, 
TblastN, re-annotation of Spirodela (and other monocot genes), and careful phy- 
logenetic analysis. OrthoMCL results and related protein resources are available 
in the ORCAE download section. 

To further understand gene family expansion or contraction in Z. marina in 
comparison with other sequenced genomes, gene family sizes were calculated for 
all gene families (excluding orphans and species-specific families) (Supplementary 
Note 4.2). The number of genes per species for each family was transformed into a 
matrix of z-scores in order to centre and normalize the data. The 100 families 
with the largest gene family size in Z. marina were selected. The z-score profile was 
hierarchically clustered (complete linkage clustering) using Pearson correlation as 
a distance measure. The functional annotation of each family was predicted based 
on sequence similarity to entries in the InterProScan and Pfam protein domain 
databases, where more than 30% of proteins in the family share the same protein 
domain. The phylogenetic profile and phylogenetic tree topology provided at 
PLAZA were used to reconstruct the most parsimonious series of gene gain and 
loss events. The Dollop program from the PHYLIP package was used to 
determine the minimum gene set at ancestral nodes of the phylogenetic tree. The Dollop 
program is based on the Dollo parsimony principle, which assumes that novel 
gene families arise exactly once during evolution but can be lost independently in 
different phylogenetic lineages. 
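
The z-score transformation and the correlation-based distance used for clustering can be illustrated with a small Python sketch. The helper names are hypothetical, and two choices are assumptions rather than details given in the text: z-scores are computed per gene family across species using the population standard deviation, and the distance is taken as one minus the Pearson correlation.

```python
import math
from typing import List, Sequence


def zscores(counts: Sequence[float]) -> List[float]:
    """Centre and scale the gene counts of one family across species."""
    n = len(counts)
    mean = sum(counts) / n
    sd = math.sqrt(sum((c - mean) ** 2 for c in counts) / n)
    if sd == 0:
        return [0.0] * n  # identical counts in every species
    return [(c - mean) / sd for c in counts]


def pearson_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance between two profiles, defined here as 1 - Pearson correlation."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # correlation undefined, treated as uncorrelated
    return 1.0 - cov / math.sqrt(var_a * var_b)


family_z = zscores([2, 4, 6])
same_trend = pearson_distance([1, 2, 3], [2, 4, 6])
opposite_trend = pearson_distance([1, 2, 3], [3, 2, 1])
```

Families whose sizes rise and fall together across species end up close to each other under this distance, regardless of their absolute sizes.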

Search for presence/absence of orthologues for specific genes and families. 
A dedicated search for orthologues/homologues was performed for genes and 
proteins involved in stomata differentiation (Supplementary Note 5.1), volatile 
biosynthesis and sensing with focus on ethylene and terpenes (Supplementary 
Note 6.1), as well as genes involved in male flower specification and pollen 
differentiation (Supplementary Note 11.1). To this end, queries were chosen 
from documented genes involved in these pathways (usually from 
Arabidopsis but occasionally from Oryza, Zea and tomato). Next, the search 
for homologues in Zostera marina, Spirodela polyrhiza, Oryza sativa japonica 
and Arabidopsis thaliana (when not used as a query) was performed using 
BLASTP. To recover missing or poorly annotated genes, a TBLASTN search was 
conducted using the above queries against the Zostera marina and Spirodela 
polyrhiza genomes. Putative orthologues were identified based on reciprocal 
BLASTP searches with Arabidopsis (or the other queries). Owing to 
species-specific duplications, this may cause some paralogous genes to appear 
orthologous to the query, or vice versa (see Extended Data Tables 1-3). To further 
confirm correct orthology assignments, phylogenetic trees were built using a broader 
sampling of protein sequences from both the query species and the three target 
species. Ambiguously aligned sequences (especially due to indels) were checked 
manually and corrected or removed. 
Extended Data Figure 1 | Number of genes expressed in five tissues of Z. marina. a, Venn diagram of expressed genes; genes with expression values (FPKM) higher 
than 1 are considered as expressed in the tissue. b, Pairwise differential gene expression analysis between tissues. The male flower shows the highest 
number of differentially expressed genes. 
Extended Data Figure 2 | Circos plot of the ten largest scaffolds of Z. marina. Tracks from outside to inside: GC percentage, gene density, and transposable 
element (TE) density (density measured in 20-kb sliding windows), and gene expression profiles from five tissues (root, leaf, male flower, female flower early 
Extended Data Figure 3 | Potential impact of transposable elements 
(TEs) on Z. marina evolution. a, Frequency distribution of pairwise 
sequence identity values between copies of Copia- and Gypsy-type LTR 
retrotransposons and DNA transposons, and their cognate consensus 
sequences (younger repeats share higher sequence similarity). Two peaks 
are detectable for Copia-type elements. b, Distance to the closest TE for 
the set of Z. marina single-copy genes and the set of Z. marina accessory 
genes. TE-proximal accessory genes are more frequent than TE-proximal 
single-copy genes. c, Frequency of pairwise sequence identity between 
accessory gene-proximal Ty3-Gypsy elements and their cognate consensus 
sequences. A number of high-identity copies (that is, putatively young 
duplicate genes) is observed. 
Extended Data Figure 4 | Unrooted maximum likelihood tree of genes encoding light-harvesting complex A (LHCA) and LHCB proteins of Z. marina, 
Spirodela polyrhiza and Arabidopsis thaliana. The analysis was carried out on protein sequences using PhyML 3 with the LG substitution model and 100 bootstrap 
replicates. See Supplementary Note 7.1 and Supplementary Table 7.3. 
Extended Data Figure 5 | Alignment of metallothionein (MT) and 
half-metallothionein (HMT) genes in Z. marina as compared with 
other plants. Alignments were performed in ClustalW on the Lyon PBIL 
web server and edited manually. The upper alignments are for type 1-3 
MTs and HMTs; the lower alignment is for type 4 EcMTs where there 
is no Zostera homologue. Conserved residues are shown in red and 
residues in the same amino acid group in blue. Cys and His residues, 
putatively involved in binding metals, are highlighted in green and yellow, 
respectively. Aromatic amino acids absent in canonical animal MTs 
are highlighted in grey. MTs and MT-like proteins were obtained from: 
Arabidopsis thaliana (ARATH), Japanese rice (ORYSJ), Cicer arietinum 
(CICAR), banana (MUSAC), wheat (WHEAT), potato (SOLTU), Setaria 
italica (SETIT), Vitis vinifera (VITVI) and the alismatids: Posidonia 
oceanica (POSOC) highlighted in grey, Spirodela polyrhiza (SPIPO) 
highlighted in blue, and Zostera marina (ZOSMA) highlighted in yellow. 
See Supplementary Note 8.2. 
Extended Data Figure 6 | Conceptual summary of physiological and 
structural adaptations made by Z. marina in its return to the sea. 
Ecosystem services shown in blue. Physical processes related to salinity, 
light and CO2 availability shown in white within light-green boxes. 
Gene losses and gains associated with morphological and physiological 
processes shown in white within the dark-green box on the right. 
Extended Data Table 1 | Genes involved in stomata development in Z. marina compared to other angiosperms 


Gene Name Symbol A. thaliana O. sativa S. polyrhiza Z. marina 
Differentiation Genes 
SPEECHLESS SPCH At5g53210 ee NF-1 
9 Sp6G0039300 
MUTE MUTE At3g06120 Os05g51820 NF-1 
FAMA FAMA At3g24140 Os05g50900 NF-1 
SCREAM / ICE1 SCRM At3g26744 Os11g32100 Sp4G0062100 Zm11g00170 
SCREAM2 / ICE2 SCRM2 At1g12860 Os01g70310 Sp0G0129300 NF-1 
FOUR LIPS FLP At1g14350 
ae. cre Bd Bees Os07g43420 Sp0G0157900 NF-1 
Spacing & Patterning Genes 
Zm87g00130 
ERECTA ER At2g26330 Os06g10230 SptsG0047400 FMEA 
ERECTA-LIKE1 ERL1 At5g62230 
alee bee deat Sa neoried Os06g03970 Sp11G0029800 Zm85g01030 
TOO MANY MOUTHS TMM At1g80080 Os01g43440 Sp18G0010300 NF-2 
STOMATAL DENSITY 
SER CEUTGS SDD1 At1g04110 Os03g04950 Sp1G0013100 NF-1 
CO2 RESPONSE 
SECRETED CRSP At1g20160 Os09g30458 Sp3G0019800 Zm58g00010 
PROTEASE 
EPIDERMAL 
PATTERNING EPF1 At2g20875 
FACTOR1 Os04g54490 Sp14G0058800 NF-1 
EPIDERMAL Os04g38470 Sp15G0006400 NF-1 
PATTERNING EPF2 At1g34245 
FACTOR2 
ee ene EPFL9 At4g12970 Os01g68598 Sp7G0057500 NF-1 
CHALLAH/EPF-LIKE6 CHAL/EPFL6 At2g30370 Oshiigeoodn Sp29G0014100 Zm270g00140 
9 Os05g39880 P 

CHAL-LIKE4/EPF- 
LIKES CAEEErrIS Bygeene’ Os03g06610 Sp2G0017500 Zm95g00050 
CHAL-LIKEZIEPF- cenueanees aatare Os11g37190 Sp24G0023900 Zm289g00040 
Polarity & Division Asymmetry Genes 
BREAKING OF 
ASYMMETRY IN THE BASL At5g60880 NC* NC* NC* 
STOMATAL LINEAGE 
PANGLOSS1 PAN1† pretines Os08g39590 Sp12G0035200 Zm293g00080 

Sp32G0009300 Zm30g00950 
PANGLOSS2 PAN2† At4g20940 Os07g05190 a see 
POLAR 
LOCALIZATION 
DURING POLAR At4g31805 Os02g55190 Sp10G0014700 Zm16g01600 
ASYMMETRIC 9 9 P 9 
DIVISION AND 
REDISTRIBUTION 
Cytokinesis Genes 
STOMATAL 
CYTOKINESIS SCD1 At1g49040 Os01g39380 Sp21G0025200 SPANNER 
DEFECTIVE 1 9 




The genes documented to be involved in stomatal development in Arabidopsis were used as queries to find orthologues in rice and Spirodela polyrhiza (duckweed). See Supplementary Note 5.1, 
Supplementary Fig. 5.1 for sequence alignment and phylogenetic tree. NF-1, not found, supported by phylogeny; NF-2, not found, unambiguous reciprocal BlastP; NC, not conserved. 

*BASL is not evolutionarily conserved, precluding the finding of its homologue in monocots, if one exists. 
†PAN genes were searched for using the documented PAN1 and PAN2 genes from maize as baits. 
Extended Data Table 2 | Ethylene-responsive transcription factor genes (ERF) in Zostera marina 


Gene Family A. thaliana O. sativa S. polyrhiza Z. marina Tissue expression in Z. marina (FPKM) 
ACS / ACSL FFE FFL MF R L 
AtACS1 

=» AtACS2 OsACS2 NF NF-4 

2a AtACS6 

oo 

22 OsACS5 

£> AtACS7 OsACS6 NF NF-1 

2 OsACS7 

Os 

of AtACS4 

2% AtACS5 

€€ AtACS8 OsACS1 $p24g0002100 NF-1 

a ALACS9 


AtACS11 


ACO FFE. «OC FFLSOUMF R L 
@ Os06g37590 
i= a 
Ee AtACO1 Osoteageeo —_SP2390011700 NF-1 
Slo 
BRS Sud cca 0s02953180 
“oad AtACO3 Os09g27750 NF-1 NF-1 
8 8 ° AtACO4 Os09g27820 
c= 
= 0s05g05680 
E AtACO5 pare are NF-1 NF-1 
ETR, ERS, EIN4 FFE EFL. MF R L 
AtETR1 0s03g49500  Sp6G0049300 
x NF-1 
af Menai 0s05906320  Sp22g0015200 
=a 
22 AtEIN4 Os04g08740 
£06 g' 
ig ALETR2 0s02957530 astaeiaae NF-1 
AtERS2 0s07g915540 peog 
CTR, EIN2 & Co FFE). FFL... ME R L 
AtCTR1 Os02932610 — s,990009700 NF-1 
NF-1 224 | 
AtEIN2 Ososeaeaoo: 9890029200 NF-1 
AtRTE1 Os01951430 54490010800 NF-1 


1g5852 
Os02g07630 
Os06945500 


$Sp8g0019500 Zm56g01580 10.8 16.7 15.2 72 35 


AtEIN3 0s07g48630 


Signaling genes and interacting partners 


$p3g0015900 Zm44g00270 4 19.8 115 124 70 
Os03g20780 
Pree 0303920790 $P090106000 _Zm140g00280 2 0.4 0.8 0 2.7 
near —Oueansgo Spaenoticd Nr 
9 $p27g0021200 
AtXRN4 Os03g58060___Sp2g0012600 _—Zm177g00170 15.6 = 10.3 13.4. 16.9 ~——-20.2 


MF, male flowers; FFE, female flowers early; FFL, female flowers late; R, roots; L, leaves; NF-1, not found as supported by reciprocal Blast and phylogeny. See Supplementary Note 6.1, Supplementary 
Fig. 6.1 for sequence alignment and phylogenetic tree. Grey indicates genes not involved in ethylene biosynthesis and signal pathways but strongly co-expressed, indicative of multiple functions. 


© 2016 Macmillan Publishers Limited. All rights reserved 




Extended Data Table 3 | Genes involved in pollen development of Z. marina compared to other angiosperms 


Gene Name Symbol A. thaliana O. sativa S. polyrhiza Z. marina Tissue expression in Z. marina (FPKM) 
FFE FFL MF R L 


_ SYN 
LESS ADHERENT 


LAP3 At3g59530 Os03g15710 $p16g0030000 NF-1 


TETRAKETIDE a- 
PYRONE 
REDUCTASE 2 


TKPR2 
(CCRLB) 


At1g68540 Os01g03670 NF NF-1 


NA NA NA NA 


TYPE Ill LIPID At5g62080 


TRANSFER LTP3 ——_At5g07230 eeacated Sp16g0007800 NF-1 

PROTEINS At5g52160 9 

GA+egulated Myb-ike CAMYB  ptza1 4440 

Blea tel Sees Aezosi00 0807959660 $p2290020200_Zmégd0090 88S BG HD B43 
FLP4 009925850 

pipcieercai nea (ERC3, _AtSg57800 ~ Os02g08230 —-« Sp5g0009000 NF-1 

WAX2) 006944300 

Pana INP1  At4g22600 ~—«Os02g44250 —»-S$p13g0036900 NF-2 

GLYCOSYLTRANSFE Atigi9710 

a cT1 AS 7eaz0 080115780 Sp14G0031700 _Zmedgd044O 10.1 1 ANT 224.1 62 

CYSTEINE 0s08g44270 -- Sp4g0036900 : 

ENDOPEPTIDASE 1 CEP1 —At5g50260 511914900 Sp11g0013300 mer 

MALE STERILE 188 ees) At5g56110  Os04g39470 -« Sp4g0087200 —«Zm262g00100 59 «446s 42S (iti ts«éB 

Peay Box FATB —At1g08510 —-Os06g05130 —- Sp21g0008900 —«Zm1g01370 «115.1. 104.8 «213.7 1293 «1243 


Thioesterase B 
Glycosyl transferase 
family GT31 

NO EXINE 
FORMATION 


The five genes encoding proteins associated with the ER-located sporopollenin metabolon in Arabidopsis are highlighted in grey. The genes documented to be involved in pollen development in 
Arabidopsis or in rice were used as queries to find orthologues. MF, male flowers; FFE, female flowers early; FFL, female flowers late; R, roots; L, leaves; NF-1, not found, supported by reciprocal Blast 
and phylogeny; NF-2, not found, single-copy gene; amb, ambiguous, with homologues too similar to point to a specific orthologue. See Supplementary Note 11.1, Supplementary Fig. 11.1 for sequence 
alignment and phylogenetic tree; Supplementary Table 11.1 for complete gene list. 
Polygenic evolution of a sugar specialization trade-off in yeast 


Jeremy I. Roop1, Kyu Chul Chang2 & Rachel B. Brem1,2 


The evolution of novel traits can involve many mutations scattered 
throughout the genome. Detecting and validating such a suite of 
alleles, particularly if they arose long ago, remains a key challenge in 
evolutionary genetics. Here we dissect an evolutionary trade-off 
of unprecedented genetic complexity between long-diverged species. 
When cultured in 1% glucose medium supplemented with galactose, 
Saccharomyces cerevisiae, but not S. bayanus or other Saccharomyces 
species, delayed commitment to galactose metabolism until glucose 
was exhausted. Promoters of seven galactose (GAL) metabolic genes 
from S. cerevisiae, when introduced together into S. bayanus, largely 
recapitulated the delay phenotype in 1% glucose-galactose medium, 
and most had partial effects when tested in isolation. Variation 
in GAL coding regions also contributed to the delay when tested 
individually in 1% glucose-galactose medium. When combined, 
S. cerevisiae GAL coding regions gave rise to profound growth 
defects in the S. bayanus background. In medium containing 
2.5% glucose supplemented with galactose, wild-type S. cerevisiae 
repressed GAL gene expression and had a robust growth 
advantage relative to S. bayanus; transgenesis of S. cerevisiae GAL 
promoter alleles or GAL coding regions was sufficient for partial 
reconstruction of these phenotypes. S. cerevisiae GAL genes thus 
encode a regulatory program of slow induction and avid repression, 
and a fitness detriment during the glucose-galactose transition but 
a benefit when glucose is in excess. Together, these results make clear 
that genetic mapping of complex phenotypes is within reach, even 
in deeply diverged species. 

A central goal of evolutionary genetics is to understand how 
organisms acquire phenotypic novelties. Such traits, if they have 
evolved over long timescales, can have a genetic basis quite 
distinct from that of traits that arose more recently. In landmark cases, single 
genes underlying species differences have been pinpointed and 
validated, but the polygenic architecture of ancient traits has 
remained a mystery. 


In hybrids formed by mating S. cerevisiae with other Saccharomyces 
species, we noted a pattern of coherent cis-regulatory variation in 
the seven genes of the galactose metabolic pathway. During growth 
in medium with glucose as the sole carbon source, the S. cerevisiae 
allele at each GAL gene conferred low expression relative to other 
Saccharomyces, except for the repressor GAL80, at which the S. cerevisiae 
allele drove expression up (Fig. 1b). Likewise, purebred S. cerevisiae 
expressed GAL effectors at low levels in glucose, and GAL80 at high 
levels, relative to other species (Fig. 1b and ref. 7). S. paradoxus, the 
sister species to S. cerevisiae, had an intermediate expression phenotype 
(Fig. 1b). Thus, the S. cerevisiae GAL program is one of heightened 
glucose repression relative to other species, as a product of cis-regulatory 
changes at the five loci that encode the seven GAL genes. Because such 
a pattern is unlikely under neutrality, these data raised the possibility 
that selective pressure on the GAL pathway had changed along the 
S. cerevisiae lineage. 

In S. cerevisiae, pre-expression of metabolic genes in glucose 
medium can boost fitness upon a switch to other carbon sources”. 
We therefore expected that GAL expression divergence in glucose 
could have phenotypic correlates in other conditions. Culturing cells 
in 1% glucose-galactose medium, we observed a qualitative distinction 
between species (Fig. 2a). In S. cerevisiae, growth was retarded by a 
diauxic lag midway through the time course, reflecting the expected 
delay in assembling galactose metabolic machinery once glucose 
is exhausted”!°!?, In more distantly related yeasts, we observed no 
lag in 1% glucose-galactose medium supplemented with galac- 
tose (Fig. 2a, b), although S. paradoxus had a modest lag (Fig. 2a, b) 
that echoed its intermediate regulatory phenotype (Fig. 1b). Glucose 
mixtures with maltose and raffinose engendered a lag in all members 
of the clade (Extended Data Fig. 1). S. cerevisiae strains from distinct 
populations all exhibited a lag in glucose-galactose cultures (Fig. 2c). 
These data highlight S. cerevisiae as an extreme among Saccharomyces 
with respect to two attributes of galactose metabolism: reduced GAL 
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Figure 1 | Polygenic cis-regulatory evolution among yeast species in 
galactose metabolic genes. a, Phylogenetic tree of Saccharomyces species 
studied here’. S. bay, S. bayanus; S. mik, S. mikatae; S. par, S. paradoxus; 
S. cer, S. cerevisiae. b, Each cell reports expression, as a ratio between the 
indicated species and S. cerevisiae, of the indicated galactose metabolism 


gene during culture in glucose medium*. Total, expression measured in 
purebred species; cis, expression from the indicated species allele in a 
diploid hybrid between this species and S. cerevisiae, reflecting effects of 
cis-regulatory divergence. 
Figure 2 | Diauxic lag, in 1% glucose-galactose medium, is conserved 
within S. cerevisiae and divergent among species. a, Growth of 
Saccharomyces type strains inoculated into medium containing 1% 
glucose and 1% galactose (n=6). S. cas, S. castellii. b, Each bar reports the 
geometric mean of the growth rate (GMR) of the indicated species from 
the time course in a, normalized to the analogous quantity in glucose 
medium. c, Growth of S. cerevisiae isolates (blue) from the indicated 
populations (W/E, Wine/European; WA, West African; NA, North 


gene expression during growth in pure glucose, and diauxic lag in 1% 
glucose-galactose. 

To dissect further the divergence in galactose metabolic behaviours, 
we focused on a comparison of S. cerevisiae with its distant relative S. 
bayanus var. uvarum (S. bayanus). In 1% glucose-galactose medium, 
both species initially metabolized glucose with similar rates, indicating 
that neither used the sugars simultaneously (Fig. 2d). In S. bayanus cul- 
tures, galactose consumption began at a point just before the complete 
exhaustion of glucose. For S. cerevisiae, glucose exhaustion triggered 
the diauxic lag, during which galactose levels in its culture medium 
were largely unchanged. After the lag, with the eventual resumption 
of log-phase growth by S. cerevisiae, galactose levels finally dropped 
(Fig. 2d). These results implicate the transition between glucose and 
galactose metabolism as a nexus of phenotypic differences between 
the species. 

For direct tests of the phenotypic impact of divergence at the GAL 
genes, we replaced GAL gene sequences in one species by those of the 
other at the endogenous loci (Fig. 3a). In a first investigation of GAL 
promoters, S. cerevisiae alleles of the regions upstream of GAL1, GAL3, 
GAL4, and GAL10 were each sufficient for a partial gain in diauxic lag 
in S. bayanus, in 1% glucose-galactose medium (Fig. 3b, c). Control 
experiments established the inverse effect of S. bayanus GAL promoter 
alleles, reducing lag in the S. cerevisiae background (Extended Data 
Fig. 2). 

We next aimed at a more complete reconstruction of S. cerevisiae-like 
galactose metabolic behaviours, which we inferred to be derived, in S. 
bayanus as a representative of the likely ancestral state. An S. bayanus 
strain harbouring all seven GAL promoters from S. cerevisiae recapit- 
ulated 69% of the lag phenotype of the S. cerevisiae parent (Fig. 3b, c), 
with GAL gene expression peaking at the same time point as that of 
wild-type S. cerevisiae and at similar amplitude (Fig. 3d). Comparison 
with the sum of lag effects from individual promoter swaps revealed 


American) and the S. bayanus type strain (black), inoculated into medium 
containing 1% glucose and 1% galactose (n = 6). d, Growth (solid lines, 
n = 4) of S. cerevisiae and S. bayanus inoculated into medium containing 
1% glucose and 1% galactose, and medium concentrations of glucose and 
galactose (dotted and broken lines, respectively; n = 2 biological replicates, 
each comprising 3 technical replicates). Error bars, s.e.m. Each set of data 
is representative of the results of two independent experiments. 


negative epistasis in the seven-promoter replacement strain (Fig. 3c), 
and in strains harbouring intermediate S. cerevisiae promoter combi- 
nations (Extended Data Fig. 3). 
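
The comparison with an additive expectation can be sketched as follows. This is only an illustration of the logic, not the analysis code used here: the function names and the numeric effects are hypothetical, and the sketch assumes that single-locus gains in lag would simply sum in the absence of epistasis.

```python
def additive_expectation(single_effects):
    """Expected combined gain in lag if single-locus effects simply add."""
    return sum(single_effects)


def epistasis(observed_combined, single_effects):
    """Observed combined effect minus the additive expectation.

    Negative values mean the combined strain shows less lag than the
    sum of its parts, i.e. negative epistasis.
    """
    return observed_combined - additive_expectation(single_effects)


# Hypothetical gains in lag for four single-promoter swaps (not data from the paper).
singles = [0.10, 0.08, 0.12, 0.05]
deviation = epistasis(0.25, singles)
print(f"additive expectation: {additive_expectation(singles):.2f}")
print(f"deviation from additivity: {deviation:.2f}")
```

A deviation below zero corresponds to the pattern described for the seven-promoter strain, in which the combined phenotype falls short of the summed single-promoter effects.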

Transgenesis of individual S. cerevisiae GAL coding regions was also 
sufficient for a partial lag in S. bayanus, in the case of GAL1, GAL2, 
GAL3, GAL4, GAL7, and GAL10 (Fig. 3b, c). Swaps of GAL promoter- 
coding fusions revealed negative epistasis at GAL1 and GAL3: for these 
genes, the sum of phenotypes from the respective promoter and coding 
transgenics was far more dramatic than the effect of the promoter-coding 
fusion (Fig. 3c). Combining all seven S. cerevisiae GAL coding or promoter- 
coding regions in S. bayanus, we observed an exaggerated, long-term 
growth delay in 1% glucose-galactose medium, distinct from the tem- 
porary lag of wild-type strains and promoter transgenics (Fig. 3b). This 
defect reflected dysfunction of multiple modules of the S. cerevisiae 
GAL pathway in S. bayanus, as it could be elicited by just the two reg- 
ulators Gal3 and Gal4 swapped from S. cerevisiae, or just S. cerevisiae 
alleles of the enzymes Gal1, Gal7, and Gal10 (Extended Data Fig. 3). 
Coding and promoter-coding swap strains did ultimately resume active 
growth (Fig. 3b) and metabolize galactose from mixed-sugar medium 
(Extended Data Fig. 4), and their GAL expression induction was mark- 
edly delayed (Fig. 3d). These strains also grew poorly in pure galactose 
medium (Extended Data Fig. 5). Together, our data make clear that 
diauxic lag in 1% glucose-galactose medium can be largely recapit- 
ulated by divergent GAL gene promoters; GAL protein alleles from 
S. cerevisiae make a partial contribution to lag when tested in isolation 
and, when combined in S. bayanus, confer growth defects far exceeding 
those of either wild-type. 

In light of the conservation of diauxic lag across S. cerevisiae (Fig. 2c), 
we hypothesized that this species had maintained its divergent 
galactose metabolic behaviour on the basis of a fitness benefit. Among 
the potential mechanisms for such an advantage, we focused on the 
possibility that as S. cerevisiae represses GAL genes in glucose-replete 
Figure 3 | S. cerevisiae alleles of GAL genes confer diauxic lag in 

1% glucose-galactose medium. a, Replacement of S. cerevisiae GAL 
sequences into S. bayanus at the endogenous loci. b, Growth of S. bayanus 
harbouring S. cerevisiae alleles of a single GAL gene, or of all seven 

genes, inoculated into medium containing 1% glucose and 1% galactose 
(n > 12). c, Each bar reports the ratio of the GMR over the time course 

of the indicated strain in b to that of wild-type S. bayanus, subtracted 
from 1; negative values are GMRs faster than wild-type. Error bars, s.e.m. 


conditions (Fig. 1), it avoids the liability of expressing unused pro- 
teins and enables rapid growth!®!%. When cultured in 2.5% glucose 
medium also containing galactose, wild-type S. cerevisiae exhibited 
a 10% faster growth rate (Fig. 4a, b), and fourfold to ninefold lower 
expression of GAL enzymes (Fig. 4c), than S. bayanus. Both spe- 
cies metabolized glucose almost exclusively across the time course 
(Fig. 4d, e). Replacement of all seven S. bayanus GAL promoters with 
S. cerevisiae alleles recapitulated the program of low GAL gene expres- 
sion (Fig. 4c), and conferred a growth rate halfway between those of 
the wild-type species (Fig. 4a, b), in 2.5% glucose medium supple- 
mented with galactose. The S. bayanus strain harbouring all seven 
S. cerevisiae GAL coding regions also expressed GAL genes at low levels 
(Fig. 4c), which mirrored this strain’s exaggerated delay in GAL gene 
induction (Fig. 3b-d), and was associated with a partial growth benefit 
(Fig. 4a, b). Promoter-coding replacement conferred no additional 
phenotype over and above the effects of transgenesis of either region 
type alone (Fig. 4a-c). We conclude that S. cerevisiae GAL promoters, 
by shutting down expression of the galactose metabolic pathway, 
are adaptive in conditions of abundant glucose, and this program can 
be phenocopied by S. cerevisiae GAL protein alleles in S. bayanus. 
Sequence analyses revealed a high ratio of inter-specific divergence to 
intra-species polymorphism in GAL gene promoters, and not in GAL 
Asterisks, significant differences (P < 0.001, Wilcoxon rank-sum) from 
wild-type S. bayanus. Also shown are expected phenotypes of promoter- 
CDS transgenics for a single gene (horizontal lines) or seven-locus 
transgenics (circles on y axis), under an additive model of contributions 
from the regions combined in the respective strains. d, GAL gene 
expression at time points indicated in the final panel of b (n > 2 biological 
replicates, each comprising 3 technical replicates). Each set of data is 
representative of the results of two independent experiments. 


coding regions (Extended Data Tables 1-3), suggestive of a history of 
directional evolution at these loci. 

In this work, we have dissected glucose-specialist phenotypes that 
distinguish S. cerevisiae from other members of the Saccharomyces 
clade. S. cerevisiae is reluctant to transition from glucose to galactose 
metabolism, and has a growth advantage in a high-glucose environment. 
Additionally, the S. cerevisiae program confers an increase in biomass 
accumulation during growth in pure galactose (Extended Data Fig. 5c) 
and could be beneficial when glucose availability fluctuates rapidly (ref. 11). As 
S. cerevisiae alleles of GAL gene promoters are largely sufficient for this 
family of traits, they may have served as an easily evolvable, and prob- 
ably adaptive, origin of these characters. By contrast, the S. cerevisiae 
GAL proteome, which confers synthetic growth defects in modern-day 
S. bayanus, may have evolved slowly over a rugged fitness landscape, 
under distinct forces or at a different period. Such a model would dove- 
tail with the cis-regulatory basis of a related, but genetically simple, 
galactose metabolism trait that evolved more recently between yeasts’. 
For any suite of divergent regulatory regions, observing cis-acting effects 
on gene expression can open a first window onto their phenotypic 
relevance and that of the gene products they control. With this strategy, 
evolutionary biologists need not be limited by polygenicity in the map- 
ping of genotype to phenotype, even between long-diverged species. 
Figure 4 | S. cerevisiae GAL alleles confer a fitness advantage in 2.5% 
glucose medium supplemented with galactose. a, Growth of S. bayanus 
strains harbouring S. cerevisiae alleles of all seven GAL genes, and the 
wild-type species, inoculated into medium containing 2.5% glucose and 
10% galactose. Inset shows the complete time course from which the main 
figure shows a narrower time window (n > 136, multiple independent 
experiments pooled). b, Each bar reports the difference in maximum 
growth rate between the indicated species and wild-type S. bayanus over the 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 27 March; accepted 18 December 2015. 
Published online 10 February 2016. 


1. Orr, H. A. The genetic theory of adaptation: a brief history. Nature Rev. Genet. 6, 
119-127 (2005). 
2. Rockman, M. V. The QTN program and the alleles that matter for evolution: 
all that’s gold does not glitter. Evolution 66, 1-17 (2012). 
3. Pritchard, J. K., Pickrell, J. K. & Coop, G. The genetics of human adaptation: 
hard sweeps, soft sweeps, and polygenic adaptation. Curr. Biol. 20, R208-R215 
(2010). 
4. Savolainen, O., Lascoux, M. & Merilä, J. Ecological genomics of 
local adaptation. Nature Rev. Genet. 14, 807-820 (2013). 
5. Nadeau, N. J. & Jiggins, C. D. A golden age for evolutionary genetics? 
Genomic studies of adaptation in natural populations. Trends Genet. 26, 
484-492 (2010). 
6. Schraiber, J. G., Mostovoy, Y., Hsu, T. Y. & Brem, R. B. Inferring evolutionary 
histories of pathway regulation from transcriptional profiling data. PLoS 
Comput. Biol. 9, e1003255 (2013). 
7. Caudy, A. A. et al. A new system for comparative functional genomics of 
Saccharomyces yeasts. Genetics 195, 275-287 (2013). 
8. Bullard, J. H., Mostovoy, Y., Dudoit, S. & Brem, R. B. Polygenic and directional 
regulatory evolution across pathways in Saccharomyces. Proc. Natl Acad. 
Sci. USA 107, 5058-5063 (2010). 
9. Venturelli, O. S., Zuleta, I., Murray, R. M. & El-Samad, H. Population 
diversification in a yeast metabolic program promotes anticipation of 
environmental shifts. PLoS Biol. 13, e1002042 (2015). 


time course of the indicated strain in a. Error bars, s.e.m.; asterisks indicate 
rates significantly different (P < 1 × 10⁻⁸, Wilcoxon rank-sum) from wild-type 
S. bayanus. c, Each row reports GAL gene expression of the indicated 
strain at time points indicated by the arrow in a (n > 2 biological replicates, 
each comprising 3 technical replicates). d, e, In each panel, the first bar 
reports sugar concentration in medium before inoculation; the remaining 
bars report sugar concentrations in medium after the indicated time course 
in a (n = 3 biological replicates, each comprising 3 technical replicates). 


10. Wang, J. et al. Natural variation in preparation for nutrient depletion reveals a 
cost-benefit tradeoff. PLoS Biol. 13, e1002041 (2015). 

11. New, A. M. et al. Different levels of catabolite repression optimize growth 
in stable and variable environments. PLoS Biol. 12, e1001764 
(2014). 

12. De Deken, R. H. The Crabtree effect: a regulatory system in yeast. J. Gen. 
Microbiol. 44, 149-156 (1966). 

13. Peng, W., Liu, P., Xue, Y. & Acar, M. Evolution of gene network activity by tuning 
the strength of negative-feedback regulation. Nature Commun. 6, 6226 
(2015). 

14. Scannell, D. R. et al. The awesome power of yeast evolutionary genetics: 
new genome sequences and strain resources for the Saccharomyces sensu 
stricto genus. G3 1, 11-25 (2011). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements This work was supported by National Institutes of Health 
GM087432 to R.B.B. and a Hellman Graduate Fellowship from the University 
of California Berkeley to J.I.R. We thank A. Arkin for his advice and resources; 
J. Rine for yeast strains; O. Venturelli for technical expertise; and A. Flamholz, 
J. Schraiber, and P. Shih for discussions. 


Author Contributions J.I.R. and R.B.B. designed experiments, J.I.R. and K.C.C. 
conducted experiments, J.I.R. analysed data, and J.I.R. and R.B.B. wrote the 
paper. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to 

R.B.B. (rbrem@buckinstitute.org). 


18 FEBRUARY 2016 | VOL 530 | NATURE | 339 


© 2016 Macmillan Publishers Limited. All rights reserved 


LETTER 


METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

Yeast strains. Strains used in this study are listed in Supplementary Table 1. 
Abbreviations in figures and tables are as follows: S. cer, Saccharomyces cerevi- 
siae; S. par, Saccharomyces paradoxus; S. mik, Saccharomyces mikatae; S. bay, 
Saccharomyces bayanus; S. cas, Saccharomyces castellii. Allele-swap strains constructed 
in haploid S. bayanus JRY294 or JRY296 (isogenic MATa and MATα 
derivatives of type strain CBS7001) used the MIRAGE method (ref. 15) with several 
modifications as follows. A 1.7-kilobase region containing the K. lactis URA3 coding 
sequence and regulatory region was amplified from the pCORE-UH plasmid (ref. 16) and 
used for each half of the inverted repeat. The S. cerevisiae GAL region to be swapped 
in was attached to one half of the inverted repeat cassette by overlap extension PCR, 
after which the two halves of the final cassette were ligated together. Owing to the 
different sizes of the two halves of the inverted repeat cassette, a second restriction 
digestion step as described in ref. 15 was not necessary to remove non-desired 
ligation products. Transformation with the cassette, followed by confirmation of 
positive transformants and plating onto 5-FOA medium, resulted in excision of 
the inverted repeat from the target genome, leaving behind a marker-less allele 
swap at the locus. Sanger sequencing was used to verify the correct nucleotide 
sequence of each swapped allele. The S. cerevisiae allele of each promoter and CDS 
was amplified from genomic DNA of YHL068 (ref. 6). Promoter, coding, and pro- 
moter-coding fusion swap strains were engineered by replacing 600 base pairs (bp) 
of intergenic region directly upstream of the CDS, the CDS, or these two regions 
combined, respectively, in S. bayanus with orthologous regions from S. cerevisiae. 
For allele swaps in the S. cerevisiae background in Extended Data Fig. 2, 720 bp of 
the region between the GAL1 and GAL10 open reading frames were amplified from 
S. bayanus strain CBS7001 and used to construct a MIRAGE cassette as above, and 
transformed into S. cerevisiae strain JRY313 (isogenic MATa derivative of BY4743) 
and selected as above. For each transgenic, two or more independent transformants 
were used as replicates for growth profiling and sugar concentration measure- 
ments. Combining unlinked allele swaps into a single genome was accomplished by 
single-cell mating of single-locus swaps, followed by sporulation, tetrad dissection, 
and diagnostic PCR to identify segregant colonies with the allele combinations 
of interest. S. cerevisiae alleles of GAL1, GAL7 and GAL10 were combined in the 
S. bayanus background by successive allele swap transformations. 

Growth curves and quantification. All growth experiments were conducted at 
26°C in YP media (2% bacto-peptone, 1% yeast extract) supplemented with var- 
ious carbon sources as follows. Experiments measuring diauxic lag in 1% glucose 
medium supplemented with galactose used medium containing 1% glucose and 
1% galactose. Experiments measuring growth profiles in other non-galactose car- 
bon sources used media containing 1% glucose and 1% of the secondary carbon 
source as indicated. Experiments measuring maximum growth rates in high- 
glucose media containing galactose used medium containing 2.5% glucose and 
10% galactose. Experiments measuring growth profiles in pure galactose medium 
used media containing 2% galactose. 

Growth time courses were performed as follows. Strains were grown in YP 
containing 2% galactose (Fig. 4 and Extended Data Fig. 5) or 2% glucose (all other 
figures) for 24h with shaking at 200 r.p.m. Each strain was then back-diluted into 
the same medium to an absorbance (A) of 0.1 and grown for an additional 6h. 
These log-phase cultures were then back-diluted to an absorbance of 0.02 in a 
96-well plate containing 150 µl of YP with the appropriate amount of a given carbon 
source. Plates were covered with a gas-permeable membrane, placed in a Tecan 
F200 plate reader and incubated with orbital shaking for the duration of each 
experiment. Measurements of A600 nm were made every 30 min. 
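
The back-dilution steps above reduce to simple C1·V1 = C2·V2 arithmetic. The following sketch is illustrative only: the function name and the example culture density are hypothetical, and it assumes absorbance scales linearly with cell density over the range used.

```python
def culture_volume_for_target(a_culture, a_target, final_volume_ul):
    """Volume of culture (µl) that gives a_target in final_volume_ul.

    Uses C1 * V1 = C2 * V2, assuming absorbance is proportional to
    cell density (an assumption that fails at high absorbance).
    """
    if a_culture <= 0 or a_target <= 0:
        raise ValueError("absorbances must be positive")
    if a_target > a_culture:
        raise ValueError("a culture cannot be concentrated by dilution")
    return a_target * final_volume_ul / a_culture


# Hypothetical log-phase culture at absorbance 0.6, diluted to 0.02 in one well.
well_volume_ul = 150.0
volume = culture_volume_for_target(0.6, 0.02, well_volume_ul)
print(f"{volume:.1f} µl culture + {well_volume_ul - volume:.1f} µl medium")
```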

Gain in lag was calculated from growth curves as 1 − (GMR of a given strain)/ 
(GMR of S. bayanus). GMR was calculated as in ref. 11, with the following 
differences. A window of 0.1-0.8 absorbance units was used for quantification 
in all figures apart from Fig. 2b. In Fig. 2b, GMR was calculated within a window 
bounded by 20% and 80% of the maximum final yield attained during the time 
course for each species. Maximum growth rate was calculated as in ref. 10 except 
that a window of 0.01-0.3 absorbance units was used and a geometric mean of 
growth rates was calculated. Final growth yield was calculated as the difference 
between initial and final A600 nm measurements as in ref. 17. 
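
A minimal sketch of the gain-in-lag calculation follows. It is not the procedure of ref. 11, which is not reproduced here. It assumes that GMR is the geometric mean of the per-interval exponential growth rates observed within the absorbance window, and that non-positive rates are dropped before averaging. All function names and the example readings are hypothetical.

```python
import math


def interval_growth_rates(times_h, absorbances, lower=0.1, upper=0.8):
    """Per-interval exponential growth rates (per hour) inside the window."""
    points = [(t, a) for t, a in zip(times_h, absorbances) if lower <= a <= upper]
    return [
        math.log(a1 / a0) / (t1 - t0)
        for (t0, a0), (t1, a1) in zip(points, points[1:])
    ]


def geometric_mean_rate(rates):
    """Geometric mean of the positive rates (assumed definition of GMR)."""
    positive = [r for r in rates if r > 0]
    if not positive:
        raise ValueError("no positive growth rates in the window")
    return math.exp(sum(math.log(r) for r in positive) / len(positive))


def gain_in_lag(gmr_strain, gmr_reference):
    """1 - (GMR of a given strain) / (GMR of the reference strain)."""
    return 1.0 - gmr_strain / gmr_reference


# Hypothetical readings every 30 min (not data from the paper).
times = [0.0, 0.5, 1.0, 1.5, 2.0]
reference = [0.10, 0.14, 0.20, 0.28, 0.40]
strain = [0.10, 0.12, 0.15, 0.18, 0.22]
gmr_ref = geometric_mean_rate(interval_growth_rates(times, reference))
gmr_strain = geometric_mean_rate(interval_growth_rates(times, strain))
print(f"gain in lag: {gain_in_lag(gmr_strain, gmr_ref):.2f}")
```

Because the geometric mean is pulled down by intervals of slow growth, a strain that stalls midway through the time course scores a lower GMR, and hence a larger gain in lag, than one that grows steadily.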

Replication schemes and analysis were as follows. To enable qualitative com- 
parisons among species in Fig. 2a-c, on a given 96-well plate, six biological rep- 
licate cultures of each strain were assayed. On each growth plate, replicate fitness 
values greater than two standard deviations from the mean of fitness values for 
that strain were considered artefacts of technical error and discarded. Displayed 
data for a given strain are the results of growth measurements from one plate and 
are representative of at least two plates cultured on different days. Experiments 


in Fig. 2d were as in Fig. 2a-c except that four biological replicate cultures were 
measured. 
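
The artefact filter described above can be sketched as follows. This is an illustration under stated assumptions, not the script used here: the text does not say whether the sample or population standard deviation was used, so the sample standard deviation is assumed, and the function name and example values are hypothetical.

```python
import statistics


def drop_outliers(values, n_sd=2.0):
    """Keep replicate values within n_sd standard deviations of their mean.

    Uses the sample standard deviation (an assumption). Groups with
    fewer than three replicates are returned unchanged.
    """
    if len(values) < 3:
        return list(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= n_sd * sd]


# Hypothetical fitness values for six replicate wells of one strain.
replicates = [0.30, 0.31, 0.29, 0.30, 0.32, 0.60]
print(drop_outliers(replicates))
```

With only six replicates per plate, a value must be very extreme to exceed two sample standard deviations, because the outlier itself inflates the standard deviation.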

To enable highly powered quantitative comparisons among strains of growth 
in 1% glucose-galactose medium in Fig. 3b, c, 12 biological replicate cultures of 
each transgenic strain were assayed across several plates and days, in each case 
alongside replicates of wild-type S. bayanus and S. cerevisiae. The growth rate of a 
given strain measured on a given plate was normalized to the value for wild-type 
S. bayanus on that plate, and these normalized measurements were then averaged 
across plates. Because we included wild-type strains on each plate, their growth 
measurements as displayed in Fig. 3b, c are averages over 78 and 54 replicates of 
S. bayanus and S. cerevisiae, respectively. Artefact filtering was as above. Differences 
in growth among strains were assessed for statistical significance by a two-sided 
Wilcoxon rank-sum test and a Bonferroni correction was applied in instances of 
multiple tests. 
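
The per-plate normalization and the multiple-testing correction can be sketched as below. This is a hedged illustration rather than the analysis code: the data layout, function names and numbers are hypothetical, and the rank-sum test itself is omitted (in practice it would come from a statistics library).

```python
from collections import defaultdict
from statistics import mean


def normalized_means(measurements, reference="S. bayanus"):
    """Normalize each strain to the reference on its plate, then average plates.

    measurements: iterable of (plate, strain, growth_rate) tuples.
    Returns {strain: mean across plates of (strain mean / reference mean)}.
    """
    by_plate = defaultdict(lambda: defaultdict(list))
    for plate, strain, rate in measurements:
        by_plate[plate][strain].append(rate)

    normalized = defaultdict(list)
    for plate, strains in by_plate.items():
        if reference not in strains:
            raise ValueError(f"plate {plate!r} has no {reference} wells")
        reference_rate = mean(strains[reference])
        for strain, rates in strains.items():
            normalized[strain].append(mean(rates) / reference_rate)
    return {strain: mean(values) for strain, values in normalized.items()}


def bonferroni(p_values):
    """Bonferroni-adjusted P values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical measurements from two plates (not data from the paper).
data = [
    ("plate1", "S. bayanus", 1.0), ("plate1", "swap", 0.5),
    ("plate2", "S. bayanus", 2.0), ("plate2", "swap", 1.0),
]
print(normalized_means(data))
print(bonferroni([0.01, 0.04, 0.5]))
```

Normalizing within each plate before averaging removes plate-to-plate and day-to-day differences in absolute growth rate, which is why wild-type wells are needed on every plate.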

To enable highly powered quantitative comparisons among strains of growth 
in medium containing 2.5% glucose and 10% galactose in Fig. 4a, b, we assayed 
growth of 250, 170, 204, 136, and 192 biological replicate cultures of wild-type 
S. bayanus, wild-type S. cerevisiae, the combinatorial promoter transgenic strains, 
the combinatorial promoter-coding transgenic strains, and the combinatorial 
coding transgenic strains, respectively, across several plates and days. As above, 
growth measurements for each strain in turn assayed on a given plate were nor- 
malized to the wild-type S. bayanus cultured on that plate, and normalized growth 
rate measurements were combined across plates. Artefact filtering and statistical 
testing were as above. 

Experiments in Extended Data Fig. 1 were as in Fig. 2a-c. Experiments for 
Extended Data Fig. 2 were as in Fig. 3 except that 12 replicate cultures were measured 
for each strain. Experiments for Extended Data Fig. 3 were as in Fig. 3b, c. 
Experiments for Extended Data Fig. 4 were as in Fig. 2a-c. 
Sugar measurements. For growth time-course experiments in which sugar con- 
centration was measured, an appropriate number of wells (see below) were inocu- 
lated into a 96-well plate and cultured in the Tecan F200 plate reader as above, such 
that media from at least two replicate wells could be harvested at each time point of 
interest and at least four replicate cultures of each strain would remain untouched 
for growth curve analysis. Samples were taken for sugar measurement by cutting 
the membrane covering the 96-well plate with a razor and extracting all 150 µl of 
cell culture in a given well. Care was taken only to puncture the membrane above 
the harvested well such that adjacent wells were not affected. Cells and debris were 
pelleted from the sampled culture by brief centrifugation and the supernatant was 
extracted for quantification of glucose and galactose. Glucose was measured using 
the GlucCell glucose monitoring system (Chemglass Life Sciences). Galactose was 
measured with the Amplex Red Galactose Oxidase assay kit (Molecular Probes, Life 
Technologies). For galactose measurements, a Tecan Safire plate reader was used 
to quantify fluorescence and the relationship between fluorescence and galactose 
concentration was determined using a standard curve. 
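
Converting fluorescence to galactose concentration through a standard curve can be sketched as follows. The shape of the curve is not stated in the text, so a straight-line fit is assumed here. The function names, standards and readings are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_fluorescence(fluorescence, slope, intercept):
    """Invert the standard curve to recover a concentration."""
    return (fluorescence - intercept) / slope


# Hypothetical standards: galactose (%) against fluorescence (arbitrary units).
standards_conc = [0.0, 0.25, 0.5, 1.0]
standards_fluor = [100.0, 600.0, 1100.0, 2100.0]
slope, intercept = fit_line(standards_conc, standards_fluor)
sample = concentration_from_fluorescence(850.0, slope, intercept)
print(f"estimated galactose: {sample:.3f}%")
```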

Data in Fig. 2d were obtained by sampling two biological replicate cultures 
at each time point. Data in Fig. 4d, e and Extended Data Fig. 4 were obtained by 
sampling three biological replicate cultures both at the start and at the endpoints of 
the growth time course. For all experiments, three technical replicates were assayed 
for each biological replicate culture sampled, and mean values are reported. All 
data are representative of two identical experiments conducted on different days. 
Quantitative PCR. Time point samples for quantitative PCR analysis were 
obtained from cultures analogously to those obtained for sugar consumption 
quantification detailed above. Between 2 and 15 replicate wells were harvested at 
each time point and pooled to have sufficient biological material for RNA isolation. 
RNA was isolated using an RNeasy mini kit (Qiagen) and complementary DNA 
was synthesized using SuperScript III (Life Technologies). DyNAmo HS SYBR 
green (Thermo Scientific) was used for quantitative PCR and all quantification was 
done on a CFX96 machine (BioRad). Gene expression levels relative to ACT1 were 
calculated using the 2^(−ΔΔCt) method (ref. 18). Three technical replicates per biological 
sample were assayed, and mean values are reported. Primer sequences and cycling 
times are in Supplementary Table 2. 
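
The relative-quantification arithmetic can be sketched as below. This is a generic illustration of the ΔΔCt calculation with a reference gene such as ACT1, assuming roughly 100% amplification efficiency. The choice of calibrator sample, the function name and the Ct values are hypothetical and are not taken from this study.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a target gene by the delta-delta-Ct calculation.

    ct_target, ct_reference: Ct values in the sample of interest.
    ct_target_cal, ct_reference_cal: Ct values in the calibrator sample.
    Assumes each PCR cycle doubles the product.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values; technical replicates are averaged first.
target_ct = mean([22.1, 21.9, 22.0])
reference_ct = mean([18.0, 18.1, 17.9])
fold = relative_expression(target_ct, reference_ct, 25.0, 18.0)
print(f"fold change relative to calibrator: {fold:.1f}")
```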

Sequence analyses. Custom Python scripts were used to extract coding sequences 
and 600 bp promoter regions for type strains of S. paradoxus, S. mikatae, and S. 
bayanus (ref. 14) for each gene that had an orthologue in each of the five Saccharomyces 
sensu stricto species as reported in ref. 14. S. cerevisiae population sequences 
were downloaded from the following sources: YJM978, UWOPS83-787, Y55, 
UWOPS05-217.3, 273614N, YS9, BC187, YPS128, DBVPG6765, YJM975, L1374, 
DBVPG1106, K11, SK1, 378604X, YJM981, UWOPS87-2421, DBVPG1373, 
NCYC3601, YPS606, Y12, UWOPS05-227.2, and YS2 from http://www.yeastrc.org/g2p/home.do; 
Sigma1278b, ZTW1, T7, and YJM789 from http://www.yeastgenome.org/; 
and RM11 from http://www.broadinstitute.org/annotation/genome/Saccharomyces_cerevisiae. 
S. cerevisiae sequences were aligned to each of the other 
three species in turn using FSA (ref. 19), with the ‘--nucprot’ option for the coding 
sequence alignments. For each set of species alignments, nucleotide replacement 
and polymorphic sites were tabulated using Polymorphorama (ref. 20) for coding regions 
and custom Python scripts for promoter regions. 

Sequence analyses of the seven genes of the GAL pathway (GAL1, GAL2, GAL3, 
GAL4, GAL7, GAL10, and GAL80) using S. paradoxus, S. mikatae, or S. bayanus 
as an outgroup were done as follows. For a given species comparison, we first 
calculated the neutrality index NI_TG statistic (ref. 21) for the promoter regions of the GAL 
genes using synonymous sites in the downstream gene as putative neutral sites, as 

$$\mathrm{NI_{TG}} = \frac{\sum_i D_{s,i} P_{p,i} / (P_{s,i} + D_{s,i})}{\sum_i P_{s,i} D_{p,i} / (P_{s,i} + D_{s,i})}$$

where i counts genes of the group, $D_s$ and $D_p$ denote the number of divergent 
synonymous and divergent promoter sites respectively, and $P_s$ and $P_p$ denote the 
number of polymorphic synonymous and polymorphic promoter sites respectively. 
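
The statistic can be computed from per-locus site counts as in the following sketch. It is an illustration, not the custom script used here. It takes the numerator as the sum over loci of D_s·P_p/(P_s + D_s) and the denominator as the sum of P_s·D_p/(P_s + D_s), following the definitions above. The function name, the tuple layout and the counts are hypothetical.

```python
def ni_tg(loci):
    """Neutrality index NI_TG across a group of loci.

    loci: iterable of (d_s, p_s, d_p, p_p) tuples, where d_* are divergent
    and p_* polymorphic site counts, at synonymous (s) and promoter (p) sites.
    """
    numerator = 0.0
    denominator = 0.0
    for d_s, p_s, d_p, p_p in loci:
        synonymous_total = p_s + d_s
        if synonymous_total == 0:
            continue  # a locus with no synonymous sites carries no weight
        numerator += d_s * p_p / synonymous_total
        denominator += p_s * d_p / synonymous_total
    if denominator == 0:
        raise ValueError("NI_TG is undefined when the denominator is zero")
    return numerator / denominator


# Hypothetical counts for two loci (not data from the paper).
example = [(10, 5, 8, 2), (20, 10, 12, 3)]
print(f"NI_TG = {ni_tg(example):.2f}")
```

Values below 1 indicate an excess of promoter divergence relative to promoter polymorphism, compared with synonymous sites.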

Minor alleles with a frequency of less than 0.15 were ignored (ref. 22) and all site 
counts were corrected for multiple hits using a Jukes-Cantor model. In computing 
NI_TG for the seven genes of the GAL pathway, we considered that since GAL1 and 
GAL10 are adjacent genes on chromosome II sharing a 662 bp promoter region, 
simply counting sites within 600 bp promoter regions for each gene separately 
would have resulted in double-counting of 532 bp. To avoid this, we aligned the 
662 bp intergenic region of these two genes and considered it one locus in the 
NI_TG calculation. Synonymous sites for this region were taken as the sum of such 
sites in both the GAL1 and GAL10 coding sequences. To evaluate significance, we 
generated 10,000 randomly chosen groups of six genes each and computed NI_TG 
across the promoters of each such null group as above. The empirical significance 
of the true NI_TG value for GAL gene promoters was then taken as the proportion 
of null groups whose promoters gave an NI_TG less than or equal to the true value. 
We assessed selection acting on non-synonymous sites in GAL coding sequences 
with a pipeline analogous to the above, except that GAL1 and GAL10 were con- 
sidered separate loci, as the coding sequences do not overlap; thus our resampling 
test used 10,000 random groups of seven coding regions each. Tests for selec- 
tion using the neutrality index statistic’ [(polymorphic non-synonymous sites/ 
polymorphic synonymous sites)/ (divergent non-synonymous/divergent synony- 
mous sites), with a similar formulation for non-synonymous sites] used the same 
pipeline as above except that we tabulated the average neutrality index across genes 
of a group, and compared this average quantity for the GAL genes against resam- 
pled groups of the same size. 
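
The resampling test can be sketched as follows. This is an illustration of the logic only: the function names are hypothetical, a toy statistic stands in for the neutrality index, and loci are assumed to be drawn without replacement within each null group.

```python
import random


def empirical_p_value(observed, null_values):
    """Proportion of null statistics less than or equal to the observed one."""
    return sum(1 for value in null_values if value <= observed) / len(null_values)


def resample_null(statistic, all_loci, group_size, n_groups=10_000, seed=0):
    """Evaluate `statistic` on random groups of loci drawn from the genome.

    all_loci must be a sequence (for example a list of per-locus records).
    """
    rng = random.Random(seed)
    return [statistic(rng.sample(all_loci, group_size)) for _ in range(n_groups)]


# Toy example: each "locus" is a single number and the statistic is the mean.
genome = [i / 100 for i in range(1, 201)]
null = resample_null(lambda group: sum(group) / len(group), genome, group_size=6)
print(f"empirical P = {empirical_p_value(0.5, null):.4f}")
```

A one-sided test of this form asks how often a random group of the same size looks at least as extreme as the GAL genes, so the smallest attainable P value is set by the number of null groups drawn.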

Dxy (ref. 25), the average number of pairwise differences between outgroup 
species and S. cerevisiae strains, and intrapopulation nucleotide diversity (π; ref. 26), 
were calculated using custom Python scripts. Nucleotide diversity was ascertained 
on alignments containing only S. cerevisiae strains belonging to the European 
population, the most deeply sampled in our data set. Empirical significance of these 
statistics for GAL gene promoters and coding sequences was calculated using the 
resampling pipeline described above. 
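
Both statistics can be computed from aligned sequences as in the sketch below. This is not the custom script referred to above. It assumes per-site values (differences divided by the number of ungapped aligned positions), skips gap characters, and applies no multiple-hit correction. Function names and sequences are hypothetical.

```python
from itertools import combinations, product


def pairwise_difference(seq_a, seq_b):
    """Fraction of ungapped aligned positions at which two sequences differ."""
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else 0.0


def nucleotide_diversity(population):
    """pi: mean pairwise difference among sequences within one population."""
    pairs = list(combinations(population, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


def dxy(population, outgroup):
    """Dxy: mean pairwise difference between two groups of sequences."""
    pairs = list(product(population, outgroup))
    if not pairs:
        raise ValueError("both groups must be non-empty")
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


# Hypothetical four-site alignment (not data from the paper).
strains = ["ACGT", "ACGA", "ACGT"]
print(f"pi = {nucleotide_diversity(strains):.3f}")
print(f"Dxy = {dxy(strains, ['TCGT']):.3f}")
```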

Code availability. Custom Python scripts used for data analysis are available upon 
request. 
Extended Data Figure 1 | Divergence in the S. cerevisiae diauxic lag 
trait is specific to growth in glucose-galactose medium. Each trace 
reports growth of the indicated yeast inoculated into medium containing 
1% glucose and 1% of the indicated secondary carbon source (n = 4, data 
representative of two independent experiments). 
Extended Data Figure 2 | The S. bayanus allele of the GAL1 and GAL10 
promoters confers partial rescue of diauxic lag in S. cerevisiae in 1% 
glucose-galactose medium. a, Data are as in Fig. 3b, except that the 
yellow curve reports growth of an S. cerevisiae strain harbouring the 
GAL1 and GAL10 promoters from S. bayanus (n > 12). b, Each bar reports 
the GMR of the indicated strain over the time course shown in a. Error 
bars, s.e.m.; asterisk indicates a significantly different rate (P< 1 10°, 
Wilcoxon rank-sum) between the transgenic swap strain and wild-type 
S. cerevisiae. Each set of data is representative of the result of two 
independent experiments. 
Extended Data Figure 3 | Epistasis between S. cerevisiae GAL gene 
alleles swapped into S. bayanus, in 1% glucose-galactose medium. Each 
bar reports the ratio of the GMR to that of wild-type S. bayanus, subtracted 
from 1, over a time course of the indicated strain inoculated into medium 
containing 1% glucose and 1% galactose. The first to fourth bars, and 
the last bar, are from Fig. 3c; each of the remaining bars reports results 
from an S. bayanus strain harbouring S. cerevisiae alleles at the indicated 
combination of GAL loci (n > 8, data representative of two independent 
experiments). Symbols and analyses are as in Fig. 3c. 
Extended Data Figure 4 | S. bayanus strains harbouring S. cerevisiae 
alleles of all seven GAL loci, inoculated into 1% glucose-galactose 
medium, deplete growth media of galactose. The first bar reports 
galactose concentration in medium before inoculation, and the remaining 
bars report concentrations after growth time courses, in the experiments in 
Fig. 3b (n = 3 biological replicates, each comprising 3 technical replicates). 
Extended Data Figure 5 | In pure galactose medium, S. cerevisiae alleles 
of GAL loci are sufficient for increased biomass accumulation and, in 
the case of protein-coding regions, slow growth. a, Each trace reports 
growth of an S. bayanus strain harbouring S. cerevisiae alleles of all seven 
GAL genes, or a wild-type control, inoculated into medium containing 
2% galactose (n > 8). b, Each bar reports maximum growth rate from the 
respective time course in a. Error bars, s.e.m.; asterisks indicate values 
significantly different (P < 0.05, Wilcoxon rank-sum) from wild-type 
S. bayanus. c, Each bar reports growth yield from the respective time 
course in a. Error bars, s.e.m.; asterisks indicate values significantly 
different (P < 0.05, Wilcoxon rank-sum) from wild-type S. bayanus. Each 
set of data is representative of the result of two independent experiments. 
Extended Data Table 1 | Excess of divergence relative to polymorphism in GAL promoter regions

Extended Data Table 1a

                                          Outgroup
Statistic                                 S. par    S. mik    S. bay
NI_TG GAL promoters                       0.32      0.37      0.31
NI_TG genome promoters                    0.72      0.91      0.73
p-value                                   0.025     0.013     0.018

Extended Data Table 1b

                                          Outgroup
Statistic                                 S. par    S. mik    S. bay
NI_TG GAL non-synonymous sites            0.98      0.92      1.03
NI_TG genome non-synonymous sites         1.28      1.21      1.27
p-value                                   0.25      0.19      0.26

Each panel reports analyses of the NI_TG measure?! comparing polymorphism within S. cerevisiae with divergence between S. cerevisiae and the indicated outgroup species, taken across promoter sites
(a) or non-synonymous coding sites (b), with normalization by the analogous measure from synonymous coding sites. In a given panel, the first row reports NI_TG across the seven GAL genes, the second
row reports the mean NI_TG from 10,000 randomly drawn gene groups, and the bottom row reports empirical significance of the distinction between GAL genes and the genomic null.
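
The empirical significance described in the table notes can be illustrated with a small resampling sketch. This is not the authors' code: the function names, the use of sampling without replacement, the fixed seed, and the one-sided "at or below the observed value" comparison are all assumptions made for illustration, and the group statistic is passed in as a caller-supplied function rather than implemented here.

```python
import random
from typing import Callable, Sequence


def resample_null(
    genes: Sequence[str],
    group_size: int,
    group_statistic: Callable[[Sequence[str]], float],
    n_draws: int = 10_000,
    seed: int = 0,
) -> list[float]:
    """Evaluate the statistic on randomly drawn gene groups (hypothetical helper).

    Each draw samples `group_size` genes without replacement (an assumption)
    and applies the caller-supplied `group_statistic` to that group.
    """
    rng = random.Random(seed)
    pool = list(genes)
    return [group_statistic(rng.sample(pool, group_size)) for _ in range(n_draws)]


def empirical_p_lower(observed: float, null_values: Sequence[float]) -> float:
    """Fraction of null draws that are at or below the observed statistic.

    The one-sided, lower-tail direction is an assumption for this sketch.
    """
    hits = sum(1 for value in null_values if value <= observed)
    return hits / len(null_values)
```

With seven GAL genes, `group_size` would be 7 and `n_draws` 10,000, matching the numbers given in the notes; the mean of the returned null values corresponds to the "genome" row.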
Extended Data Table 2 | Divergence between species, and polymorphism within S. cerevisiae, at GAL loci

Extended Data Table 2a

                          Outgroup
Dxy                       S. par    S. mik    S. bay
GAL promoters             0.172     0.276     0.328
genome promoters          0.15      0.241     0.281
p-value                   0.147     0.099     0.045
GAL CDS                   0.102     0.155     0.188
genome CDS                0.096     0.153     0.190
p-value                   0.337     0.429     0.573

Extended Data Table 2b

π                         promoters    CDS
GAL                       0.00067      0.00069
genome                    0.00194      0.00113
p-value                   0.181        0.513

a, The first line reports mean Dxy for GAL promoters in comparisons between S. cerevisiae and outgroup species. The second line reports the analogous statistics for the mean of 10,000 randomly
drawn gene groups, and the third line reports empirical significance of the distinction between GAL genes and the genomic null. The fourth, fifth, and sixth lines are analogous to the above except that
GAL coding regions were analysed. b, The first line reports π for GAL promoters or coding regions within S. cerevisiae. The second line reports the analogous statistics for the mean of 10,000 randomly
drawn gene groups, and the third line reports empirical significance of the distinction between GAL genes and the genomic null.
Extended Data Table 3 | Evaluation of selection acting on S. cerevisiae promoter and coding sequences using the neutrality index statistic

Extended Data Table 3a

                                    Outgroup
Statistic                           S. par    S. mik    S. bay
GAL promoters                       0.32      0.42      0.35
genome promoters                    0.94      1.13      0.92
p-value                             0.003     0.009     0.014

Extended Data Table 3b

                                    Outgroup
Statistic                           S. par    S. mik    S. bay
GAL non-synonymous sites            1.43      1.24      1.44
genome non-synonymous sites         2.02      1.88      1.80
p-value                             0.30      0.22      0.36

Data are as in Extended Data Table 1 except that the neutrality index?* was used for each test.
Visualization of a short-range Wnt gradient in the intestinal stem-cell niche

Henner F. Farin, Ingrid Jordens, Mohammed H. Mosa, Onur Basak, Jeroen Korving, Daniele V. F. Tauriello,
Karin de Punder, Stephane Angers, Peter J. Peters, Madelon M. Maurice & Hans Clevers


Mammalian Wnt proteins are believed to act as short-range 
signals'~*, yet have not been previously visualized in vivo. Self- 
renewal, proliferation and differentiation are coordinated along 
a putative Wnt gradient in the intestinal crypt®. Wnt3 is produced 
specifically by Paneth cells®’. Here we have generated an epitope- 
tagged, functional Wnt3 knock-in allele. Wnt3 covers basolateral 
membranes of neighbouring stem cells. In intestinal organoids, 
Wnt3-transfer involves direct contact between Paneth cells 
and stem cells. Plasma membrane localization requires surface 
expression of Frizzled receptors, which in turn is regulated by 
the transmembrane E3 ligases Rnf43/Znrf3 and their antagonists 
Lgr4-5/R-spondin. By manipulating Wnt3 secretion and by 
arresting stem-cell proliferation, we demonstrate that Wnt3 mainly 
travels away from its source in a cell-bound manner through cell 
division, and not through diffusion. We conclude that stem-cell 
membranes constitute a reservoir for Wnt proteins, while Frizzled 
receptor turnover and ‘plasma membrane dilution’ through cell
division shape the epithelial Wnt3 gradient. 

Distinct mechanisms have been proposed as to how gradients of 
secreted Wnt protein control growth and patterning during animal 
development. Evidence exists for highly localized activity between 
adjacent cells'~°, as well as for long-range activity®®. Drosophila 
Wingless (Wg) is classically considered a morphogen in the wing 
imaginal disk, secreted from a stripe of cells to form a long-range 
concentration gradient!”. This concept was recently challenged,
when it was shown that membrane tethering of Wg to the producer 
cells did not perturb early development*. In vertebrates, limited infor- 
mation exists on endogenous Wnt protein distribution!?, and on how 
Wnt gradients are built and maintained. Transcriptional Wnt target 
gene activity has been well-characterized in the Wnt-dependent 
stem-cell compartment of intestinal crypts®. Expression of Wnt target
genes occurs in a gradient", with highest activity at the crypt
bottom where Lgr5+ stem cells'> are located between post-mitotic
Paneth cells that produce high levels of Wnt3 (refs 6, 7). Paneth-
cell-derived Wnt3 is redundant with mesenchymally produced Wnts 
in vivo'®'7. In three-dimensional organoid culture, Lgr5 stem cells
generate ever-growing, self-organizing ‘mini-guts’ with defined crypt-
and villus-domains!®, a process for which Paneth-cell-derived Wnt3 is
essential!®. In the current study, we exploit the mammalian mini-gut
system as an experimental equivalent of the fly imaginal disk to dissect 
paracrine Wnt signalling. 

We generated a mouse Wnt3 allele by introducing a haemagglutinin
(HA)-tag at position Q41, located in a weakly conserved surface
region opposite the Frizzled receptor binding site’? (Fig. 1a and
Extended Data Fig. 1). This tagging strategy is more generally applicable
to different Wnt members and peptide epitopes (Extended Data
Fig. 2). While homozygous loss of Wnt3 in mice causes gastrulation
Figure 1 | Endogenous Wnt3 protein is localized to basolateral plasma
membranes in the intestinal crypt. a, Generation of a Wnt3-HA knock-in
allele. b, PCR genotyping; Wnt3^HA/HA mice are viable. WT, wild type.
c, Morphology of organoid cultures. d, HA immunodetection protocol
using tyramide signal amplification (TSA). e, Confocal imaging in
intestinal tissues. Co-staining of HA, Epcam (membranes) and WGA
(secretory granules). f, Whole-mount staining of Wnt3^HA/HA organoid.
g, Wnt3 signal on crypt membranes is independent of permeabilization.
Three-dimensional projected confocal images (see Supplementary Video 2).
h, Staining in Wnt3^HA/HA;Lgr5^GFP/+ organoids. Enriched signal on
stem cells (Lgr5-GFP+) compared with TA cells (Ki-67+/Lgr5-GFP−).
Quantification in n = 8 organoids (mean ± s.d.; P< 10~°; t-test).
Scale bars, 50 μm (c, e, f) and 10 μm (g, h). Paneth cells were identified on
the basis of morphology and labelled by asterisks.
Figure 2 | Wnt3 transfer requires direct cell contact and has a limited
range. a-d, Cell re-association of Wnt-dependent (Wnt3^Δ/Δ) with Wnt3-
producing Paneth cells. a, Experimental protocol; b, light microscopic;
and c, epifluorescent images. Self-organization of crypts with dsRED-
expressing Paneth cells. d, Mean organoid number (± s.d.) in n = 3
independent wells after 10 days. e, Measurement of epithelial Wnt3-range.
Re-association using Wnt3-HA producing cells (non-labelled) and wild-type
cells expressing dsRED. Z-projected confocal images and colour-coding
depending on the distance to the closest Wnt3-HA producing cell.
f, Average percentage (± s.d.) of each distance fraction in n = 10 organoids.
Scale bars, 200 μm (b, c) and 25 μm (e).


arrest’, Wnt3^HA/HA knock-in mice are viable and fertile (Fig. 1b). To
study pathway activity, we collected crypts from homozygous and
control mice either directly, or after 2 weeks of mini-gut culture.
Quantitative reverse transcription PCR (RT-qPCR) analysis showed
no differences in the levels of Wnt target gene expression (Extended
Data Fig. 3). In concordance, Wnt3^HA/HA organoids displayed normal
growth and morphology (Fig. 1c). Together, these results indicate that
the internal HA-tag fully preserved Wnt3 function.

Western blot analysis of mini-guts revealed low expression of the 
Wnt3-HA protein, only detectable after conversion of all cells into 
Wnt3-producing Paneth cells”! (Extended Data Fig. 4a, b). On intes- 
tinal sections, a multi-step amplification protocol allowed specific 
Wnt3-HA detection in Paneth cells (Fig. 1d, e). Paneth cell granules 
were negative, implying that Wnt3 is not secreted via this dominant, 
apical secretion route (Extended Data Fig. 4c, d and Supplementary 
Video 1). To obtain high-resolution information in a system amenable 
to experimental manipulation, we derived mini-guts from homozy- 
gous knock-in mice. We observed fluorescence exclusively in the crypt 
equivalents within the mini-guts, while the villus-domain was negative 
(Fig. 1f). Confocal laser microscopy demonstrated that Wnt3 covers 
basolateral crypt membranes (Fig. 1g, Extended Data Fig. 4e, f and 
Supplementary Video 2). This staining pattern was not dependent on 
membrane permeabilization, indicating that Wnt3 is mainly localized
on the external cell surface. Co-localization with the Lgr5^EGFP
reporter’? and Ki-67 showed a specific enrichment on Lgr5 stem cells
compared with transit-amplifying (TA) cells (Fig. 1h).

To determine whether the mini-gut cultures secrete diffusible Wnt
activity, we tested if Wnt3-producing, wild-type organoids could rescue
the growth of Wnt3^Δ/Δ (knockout) cultures. Fluorescently labelled
mini-guts were co-seeded in Matrigel. We found that the presence
of wild-type mini-guts (dsRED) did not support growth of Wnt3^Δ/Δ
mini-guts (green fluorescent protein, GFP) (Extended Data Fig. 5a-d),
arguing that diffusible Wnt activity was negligible. Consistently, we could
Figure 3 | Frizzled receptors act as membrane tether of Wnt3.
a, b, R-spondin (Rspo) controls membrane localization in APC knockout
(a) but not in Rnf43/Znrf3 double mutants (RZ DKO; b). Epcam staining
of cell membranes; arrows show intracellular staining. c, Mean percentage
of organoids (± s.d.) with membranous or intracellular staining. Data
from n = 3 independent experiments. d, Neutralizing pan-Frizzled
antibody (anti-Fzd) reduces surface expression of Wnt3-HA with residual
intracellular signal (arrows). Treatment for 48 h; CHIR-99021 was added to
rescue organoid growth. The z-projections of confocal images are shown;
all scale bars, 25 μm. e, Model for control of the Wnt3 localization by
Frizzled and the E3 ligases Rnf43/Znrf3.


neither detect transfer of Wnt3-HA between organoids, nor interfere
with growth of Wnt3^HA/HA cultures by embedding anti-HA affinity
beads that efficiently sequester diffusible factor (Extended Data
Fig. 5e, f).

To investigate direct protein transfer between adjacent epithelial 
cells, we performed re-association assays after single-cell dispersal of 
fluorescently labelled organoids. GFP-expressing Wnt3^Δ/Δ cells were
mixed with in vitro differentiated wild-type Paneth cells (dsRED).
Aggregates were transferred to Matrigel and cultured in the absence of
exogenous Wnt (see experimental scheme in Fig. 2a). Wild-type Paneth
cells rescued the growth of Wnt3^Δ/Δ cells (Fig. 2b-d). We observed
efficient self-organization into crypt structures with a stereotypic 
arrangement of dsRED-labelled cells at the budding tips (Fig. 2c), con- 
sistent with the model that Paneth cells serve as symmetry-breaking 
crypt organizers in culture. Staining revealed spatial confinement of 
Wnt3-HA around individual Paneth cells (Supplementary Video 3). 
To measure the extent of Wnt3 propagation within the epithelial 
sheet more easily, we generated spherical organoids (by addition 
of Wnt3a-conditioned medium (CM))’. Interfaces between 
Wnt3-HA-producing cells (non-labelled) and wild-type receiving cells 
(dsRED-labelled, which only express non-tagged Wnt) were readily 
recognizable. Wnt3-HA penetrated one cell diameter (two at the most) 
into the Wnt3-HA-negative domains (Fig. 2e, f). Strong differences 
in Wnt3-HA decoration were observed between adjacent cells. Thus, 
localized production and limited epithelial spreading appeared impor- 
tant hallmarks of Wnt3 distribution. 

We subsequently investigated if Wnt3-HA binding to stem-cell 
membranes depends on their cognate Frizzled receptors. As crypts 
Figure 4 | Cell proliferation influences Wnt3 surface level and signal
range. a, Chase experiment: 24 h after block of Wnt-production (IWP-2),
Wnt3-HA is sequestered in WGA-positive Paneth cells (top) and absent
from the cell surface (non-permeabilized samples; bottom). Parallel
block of proliferation by inhibitors of EGFR, MEK or CDK4/6 causes
retention of surface HA signal. b, Quantification of Wnt3 surface level
and transcriptional activity. Grey bars, mean HA-staining intensity in
n = 6 non-permeabilized organoids per condition. Blue bars, mean Axin2
expression in n = 3 independent wells; error bars, s.d.; t-test compared
with IWP-2 condition: ***P < 10⁻³; **P < 10~*. c, Pulse experiment after
Wnt-release (washout of IWP-2/Wnt-CM). Crypts formed in normal
medium (arrows) are positive for Wnt3-HA. After block of proliferation,
rudimentary crypts are formed (arrowheads) and the Wnt3-HA signal
remains focused (brackets). Z-projected confocal images. All scale
bars, 50 μm.
express multiple Frizzleds®, we altered their surface expression
indirectly by modulation of the transmembrane E3-ligases Rnf43
and Znrf3. These act as negative feedback regulators of Wnt signalling
by ubiquitinylation and subsequent lysosomal degradation of
Frizzleds””**. Of note, Lgr4/5 proteins and their R-spondin ligands 
antagonize Frizzled downregulation by recruiting Rnf43/Znrf3 
into a trimeric complex”**. Given that organoid growth requires 
R-spondin}®, we first tested its effect on Wnt localization. We found 
that the Wnt3-HA signal expanded or decreased in an R-spondin1 
concentration-dependent manner (Extended Data Fig. 6a). Because 
Wnt3 transcription was also strongly R-spondin-dependent (Extended 
Data Fig. 6b), we aimed to uncouple the effect of R-spondin on 
Frizzled-turnover from downstream signalling. To this end, we 
genetically activated the pathway in Wnt3^HA/HA cultures. CRISPR/
Cas9-induced null mutations in APC or in Rnf43/Znrf3 (Extended 
Data Fig. 7)”° resulted in R-spondin-independent growth. Wnt3-HA 
was absent from the surface of APC mutant mini-guts (that express 
high levels of Rnf43/Znrf3), but could be restored by addition of 
R-spondin (Fig. 3a, c). This implied that Frizzleds tether Wnt3 to
membranes, as has been suggested in the Drosophila wing disc”®.
Consistently, Wnt3 surface localization was high and insensitive to
R-spondin in Rnf43/Znrf3 mutant cultures (Fig. 3b, c). In cell
re-association assays, Wnt3-HA transfer to Rnf43/Znrf3 mutant cells was
increased and to APC mutant cells was reduced (Extended Data Fig. 8).
Independently, the application of a neutralizing pan-Frizzled
antibody?’ strongly reduced Wnt3-HA surface expression (Fig. 3d).
Together, these experiments indicate that surface-expressed Frizzled 
binds and retains Wnt3 (Fig. 3e). 

To explore the kinetics of Wnt3 surface expression, we acutely 
blocked Wnt secretion using the Porcupine inhibitor IWP-2 (ref. 28). 
After 24h, Wnt3-HA was observed exclusively inside Paneth cells, 
highlighting the requirement of Porcupine for Wnt trafficking to the 
plasma membrane of these cells (Fig. 4a). Staining of non-permea- 
bilized organoids confirmed that surface-bound Wnt3 was absent, 
indicating that its half-life is below 24h. To test if cell division has an 
impact on propagation of surface-bound Wnt3, we induced cell cycle 
arrest using the EGF-receptor inhibitor Gefitinib, the downstream 
MEK inhibitor PD-0325901, or the CDK4/6 inhibitor Palbociclib 
(Extended Data Fig. 9a, b). In all cases, residual Wnt3 surface signal
and increased transcriptional activity were observed 24 h after initiation
of IWP-2 treatment (Fig. 4a, b), indicating that cell division under
normal conditions dilutes cell-bound Wnt3.

To address if cell division is indeed the means by which Wnt3 is 
propagated from the Paneth cell source, we blocked Wnt secretion for 
3 days, followed by washout of the Porcupine inhibitor. Formation of 
multiple crypt-like structures, invariably decorated with a gradient 
of Wnt3-HA, was observed within 2 days after IWP-2 washout (Fig. 4c 
and Supplementary Video 4). In the presence of EGFR, MEK or 
CDK4/6 inhibitors, crypt formation was blocked and Wnt3-HA 
remained localized around the producing cells. These results suggest 
that signal propagation is dependent on cell division. 

From these combined observations, we concluded that two phe- 
nomena control the size and shape of the short-range, graded Wnt3 
signal. First, Wnt3, produced by Paneth cells, does not freely diffuse, 
but is transferred to the nearest neighbour, typically an Lgr5 stem cell. 
Frizzled receptors tether Wnt to the membrane of the receiving stem 
cell. Surface expression of the Wnt-Frizzled complex is negatively con- 
trolled by Rnf43/Znrf3, which in turn can be alleviated by Lgr4/5 and 
R-spondin. Second, Frizzled-bound Wnt spreads passively by stem-cell 
division, which dilutes surface-bound Wnt and thus creates a gradient. 
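
The dilution argument can be made concrete with a deliberately simple sketch. The text gives no quantitative model, so everything here is an assumption for illustration: each division is taken to split the surface-bound Wnt equally between daughters, with no new ligand supplied and no receptor turnover, and the function names are hypothetical.

```python
def surface_wnt_after_divisions(initial_level: float, n_divisions: int) -> float:
    """Per-cell surface Wnt after n divisions under equal partitioning.

    Assumes (for illustration only) that each division halves the load
    and that nothing is added or degraded in between.
    """
    if n_divisions < 0:
        raise ValueError("n_divisions must be non-negative")
    return initial_level / (2 ** n_divisions)


def dilution_gradient(initial_level: float, max_divisions: int) -> list[float]:
    """Surface Wnt for cells that are 0, 1, ..., max_divisions divisions from the source."""
    return [
        surface_wnt_after_divisions(initial_level, n)
        for n in range(max_divisions + 1)
    ]


if __name__ == "__main__":
    # A cell loaded with one arbitrary unit of Wnt next to a Paneth cell.
    print(dilution_gradient(1.0, 3))  # [1.0, 0.5, 0.25, 0.125]
```

Under these assumptions the level falls geometrically with the number of divisions since Paneth-cell contact, which is one way a graded signal could arise without diffusion; Frizzled turnover would steepen the decline further.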

Similar highly localized Wnt signals have been described in inver- 
tebrate models!*. Long-range signalling is dispensable during fly 
embryogenesis, but transfer over longer distances becomes impor- 
tant during post-juvenile organ growth‘. Apparently, the body dimen- 
sions then become too large to support Wnt-signalling only by auto/ 
juxtacrine sources. From an evolutionary perspective it is interesting 
that the R-spondin/Lgr Wnt signal amplifier is a vertebrate-specific 
innovation, while Rnf43 is already present in Caenorhabditis elegans. 
Organ size in vertebrates may require amplified cellular output from 
stem-cell niche compartments. The R-spondin/Lgr signalling module 
appears designed to prolong/increase Wnt surface levels once cells 
have lost contact with the niche to build a TA compartment. Unlike
the Drosophila wing disc, where autocrine Wnt signalling appears to
predominate’, Wnt signalling in the crypt is non-cell-autonomous and
relies on the presence of dedicated niche Paneth cells as well as
non-epithelial sources!®!”. Transfer within the epithelium requires direct Paneth cell
contact (this study), a notion that is in agreement with our previous 
observation that stem cells attempt to maximize their membrane con- 
tact to Paneth cells”. This also explains why a central position of Lgr5 
stem cells between Paneth cells supports stem-cell maintenance more 
efficiently than a peripheral position”. It has been demonstrated 
that topical application of Wnt (immobilized on a bead) to one side 
of a stem cell results in an asymmetrical outcome of the subsequent 
stem-cell division*®. Crypt stem cells located at the niche boundary 
experience a similarly sharp Wnt gradient, as only one side of a
‘boundary stem cell’ touches its Wnt source, the Paneth cell. Stem- 
cell plasma membranes thus carry positional information that could be 
involved in timing of differentiation. How Wnt3 is transferred between 
producing and receiving cells remains to be determined. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS

Mice. For generation of the Wnt3^HA allele, a 1,044-base-pair genomic
fragment covering exon 2 of the gene, generated using the primer pair forward
5′-TCCTTTGTCCTTTATATTTGGATTC-3′ and reverse
5′-AGGGAAATGTCACACATGTCTAC-3′, was subcloned and the HA-tag was
introduced using the following PCR primers: forward
5′-GGCACGTCGTATGGGTACTGGGAGGCCAGAGATGTGTAC-3′ and reverse
5′-AGACTACGCATCTCTGCCTCTGCTCTGCGGCTCCATC-3′ (underlined residues
encoding for the epitope tag). The fragment was subsequently amplified with
primers flanked by LoxP and LoxM sites, and introduced in reverse orientation
into the pL451 plasmid that contained a Pgk-neomycin-pA selection cassette
flanked by Frt-sites by recombination cloning (In-Fusion HD, Clontech). The
wild-type exon 2 and the 5′- and 3′-homology regions were introduced in a
configuration that allowed excision of the wild-type exon 2 and inversion of
the HA-tagged exon 2 after Cre-mediated recombination between the two pairs
of LoxP and LoxM sites (Extended Data Fig. 1c). After sequence verification
the linearized targeting vector was used for homologous recombination in
embryonic stem cells that was confirmed by Southern blot analysis (Extended
Data Fig. 1d). After blastocyst injection, chimaeric offspring were crossed to
FLPeR mice*! to remove the neomycin expression cassette. Germline
transmission was confirmed by PCR and mice were crossed to Pgk-Cre* mice to
induce germline recombination of the Wnt3^HA allele.
The Lgr5^EGFP-IRES-CreERT2 allele, the conditional Wnt3^fl allele and the transgenic
Vil-CreERT2 line have been described!>***4. Alleles were maintained on a mixed
C57BL/6 background and littermate groups of mixed sexes were analysed at ages
between 8 and 12 weeks. Experiments were performed according to guidelines and
reviewed by the Animal Experiments Committee (DEC) of the Royal Netherlands
Academy of Arts and Sciences.

Organoid culture. Mouse organoids were established and maintained as
described!*"® from isolated crypts collected from the entire length of the small
intestine. The basic culture medium (ENR) contained advanced DMEM/F12
supplemented with penicillin/streptomycin, 10 mM HEPES, 1× Glutamax, 1× B27
(all from Life Technologies) and 1 mM N-acetylcysteine (Sigma) that was
supplemented with murine recombinant EGF (Peprotech), R-spondin1-CM (5% final
volume if not indicated otherwise) and Noggin-CM (10% v/v). Wnt3a-CM was
used where specified at 50% (v/v), if not indicated otherwise. A
mycoplasma-free status was confirmed routinely. Wnt3-null cultures were obtained from
Vil-CreERT2; Wnt3^fl/fl organoids after treatment with 0.5 μM 4-OH-tamoxifen
overnight and were then maintained by addition of Wnt3a-CM. For constitutive
lentiviral expression of mouse Wnt3, Wnt3-HA and Wnt3-Flag, the open reading
frames were inserted 3′ of a Pgk promoter in the lentiviral vector pLV.IRES-puro.
For cell labelling Pgk::EGFP-IRES-puro or Pgk::dsRED-IRES-puro containing
lentivirus was used and transduced as described*®. Puromycin (Invivogen) was
included in the medium (0.5-1 μg ml⁻¹) and fluorescent reporter expression was
monitored and documented using an EVOS fl inverse fluorescence microscope
(Life Technologies).

For bead neutralization experiments, the affinity matrix (anti-HA (clone
3F10)-coupled agarose beads, Roche) was washed thoroughly in medium before
resuspension in Matrigel (in a ratio 1:6 (v/v)) and organoid embedding. Culture
media were pre-incubated with 50 μl ml⁻¹ washed beads before addition. For cell
re-association experiments, cells were collected 5 days after seeding, either in
regular medium containing Wnt-CM or Paneth cell differentiation medium”? in the
presence of 1 μM CHIR-99021 (Stemgent) and 15 μM DAPT (Sigma-Aldrich) in
TrypLE Express (Life Technologies) and digested for 2 × 5 min at 37 °C followed
by gentle mechanical dispersal. Cells were washed in cold medium containing
5% FBS and resuspended in medium containing 1,000 U ml⁻¹ DNase I (Roche)
before filtration and centrifugation. The cells were then resuspended in medium
containing 10% dissolved Matrigel (BD), 1 μM CHIR-99021 and 10 μM Y-27632
(Sigma-Aldrich). Eight thousand Paneth cells and/or 24,000 GFP-positive cells
were co-seeded in 48-well suspension plates that had been pre-blocked with 5%
FBS/PBS. After 24 h of re-association, the aggregates were collected, seeded in
Matrigel and cultured in the presence of 1 μM CHIR-99021 which was withdrawn
after 48 h. Organoid number was counted after 10 days and the experiment was
replicated twice. For staining of re-associated organoids 25% (v/v) Wnt-CM
was added to the normal culture medium to generate large mosaic spheroids,
containing increased number of Paneth cells'®. Here dsRED-labelled cells and
non-labelled Wnt3^HA/HA cells were mixed in a 2:1 ratio. Blocking pan-Frizzled
antibody (clone OMP-18R35 (ref. 27)) was added to the culture medium at a
concentration of 50 μg ml⁻¹ in the presence of 3 μM CHIR-99021. For Wnt3-HA
chase experiments, 2.5 μM IWP-2 (Stemgent) was added to the normal medium.
Gefitinib (AstraZeneca, 1 μM), PD-0325901 (Sigma, 2 μM) and Palbociclib (PD
0332991, Sigma, 1.5 μM) were added 6 h before IWP-2 treatment to block the
cell cycle. For pulse experiments, organoids were cultured in medium containing
Wnt-CM/IWP-2 for 3 days before washing and growth in normal culture medium
with cell cycle inhibitors (as above). Time-lapse videos were recorded on a Leica
AF7000 microscope using 30 min intervals.

CRISPR/Cas9 mutagenesis. Mutations in mouse organoids using the CRISPR/ 
Cas9 technology were induced as described”. For targeting of mouse APC, the 
reported single guide RNA (sgRNA)-5 and for Rnf43 and Znrf3 sgRNA-3 and 
sgRNA-2 were used, respectively. Correct targeting was confirmed by sequencing 
of Escherichia coli-cloned genomic PCR products (Extended Data Fig. 7). For each 
genotype, two independent clones were studied. 

Immunostaining. Intestinal tissues were fixed in 4% formaldehyde/PBS for 20 min at
room temperature (22 °C) followed by embedding in Tissue-Tek O.C.T. compound.
Cryosections (50 μm) were incubated with Mouse on Mouse Blocking Reagent
(Vector Laboratories) and then with rat anti-HA monoclonal antibody (clone 3F10;
Roche; 0.2 μg ml⁻¹) overnight at 4 °C, followed by incubation with unconjugated
rabbit anti-rat IgG(H+L) (SouthernBiotech, 2.5 μg ml⁻¹, 1 h at room
temperature) and BrightVision Poly-HRP-Anti Rabbit reagent (ImmunoLogic, 1 h at room
temperature) that were both pre-blocked using normal mouse serum. For
detection TSA (Life Technologies) was used according to the supplier's instructions
with Alexa Fluor 488 tyramide diluted 1:100 in amplification buffer. Wheat Germ
Agglutinin, Texas Red-X Conjugate (Life Technologies, 5 μg ml⁻¹) was applied
for 1 h at room temperature before the slides were embedded using ProLong Gold
Antifade reagent (Life Technologies).

For whole-mount stainings, organoids were collected in ice-cold medium,
pelleted and resuspended in chilled Cell Recovery Solution (BD). After incubation
(15 min on ice) and gentle mixing, organoids were washed in medium, pelleted,
resuspended and fixed in 2% formaldehyde/PBS overnight at 4 °C. Organoids were
subsequently permeabilized in PBS 0.1% Tween 20 (30 min at room temperature)
before resuspension in blocking solution containing 0.2% normal donkey serum/
PBS at 4 °C for 1 h. After incubation in anti-HA antibody (as above, 0.5 μg ml⁻¹, 4 °C
overnight), three wash steps were performed at room temperature by addition of
ice-cold PBS 0.1% Tween 20 and organoid sedimentation. For staining without
permeabilization, organoids were fixed overnight in 2% formaldehyde/PBS containing
4% sucrose before washes in blocking buffer and incubation with primary antibody
in the absence of detergent. Subsequent incubation steps for both protocols were
rabbit anti-rat IgG(H+L) (as above) and BrightVision Poly-HRP-Anti Rabbit reagent
(as above; 1:2 diluted in PBS with final 0.1% Tween 20, 0.1% Triton X-100). At this
step Wheat Germ Agglutinin (as above), 4′,6-diamidino-2-phenylindole (DAPI),
eFluor-660-conjugated rat anti-Ki67 (eBioscience; clone SolA15; 0.4 μg ml⁻¹) and/
or anti-mouse-Epcam-APC (eBioscience; clone G8.8; 0.5 μg ml⁻¹) were added,
followed by washes and pre-equilibration in TSA amplification buffer (PBS pH 7.6 with
0.1 mM imidazole). Specimens were incubated for 30 min at room temperature
in a 1:1,500 dilution of TSA (as above or using TSA-Cy3; PerkinElmer), followed
by washes and mounting (as above). Specimens were documented on a Leica SP8
scanning confocal microscope using a ×20 objective and z-step size of 1 μm. A ×63
objective with 0.5 μm z-steps was used for Extended Data Fig. 4d, f.

Image analysis. Fiji software was used to generate z- and three-dimensional
projections that were exported as video files. Channels were overlaid using Adobe
Photoshop CS6. For co-localization analysis, presence of Lgr5-GFP, anti-HA and
anti-Ki-67 staining was determined in each DAPI-positive cell on confocal z-projections
of crypt hemispheres (n = 8 representative specimens). The Wnt3-HA range in
mosaic organoids was determined on z-projected confocal images. For this, Epcam
signals were imported into CellProfiler software* and analysed using the example
pipeline ‘Tissue Neighbours’ that was adapted to trace all cell outlines. This mask
was overlaid on dsRED and Alexa Fluor 488 signals to identify Wnt3-HA
producing (dsRED negative) and Wnt3-HA receiving (dsRED positive) cells. Alexa
Fluor 488 signals were thresholded to determine the cell-distance to the nearest
producing cell: receiving cells were labelled positive only if a signal was found
on a membrane side not in direct contact with an adjacent Wnt-HA producing
cell. To omit cells that were engulfed by producers, receiving cells were excluded
if they had four or more dsRED-negative direct neighbours. Cells were counted
as ‘no transfer’ events only if they had direct dsRED-negative neighbour(s). The
distance fractions were determined in n = 10 representative mosaic organoids
from experiments that were repeated three times. For surface-level measurement,
Wnt3-HA and Epcam images were recorded using constant microscope settings.
Pixel intensities (whole image) were measured using Fiji. For each specimen the
Wnt3-HA intensity was normalized to the Epcam intensity to adjust for organoid
size. Six representative crypt hemispheres were recorded per condition and the
experiments were repeated three times.
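
The counting rules and the Epcam normalization described above can be summarized in a short sketch. The analysis itself was carried out in CellProfiler and Fiji; the function names, argument names and return labels below are hypothetical, and the order in which the rules are checked is an assumption.

```python
def classify_receiving_cell(
    n_producer_neighbours: int,
    signal_on_non_contact_side: bool,
) -> str:
    """Classify one dsRED-positive (receiving) cell.

    n_producer_neighbours: number of dsRED-negative direct neighbours.
    signal_on_non_contact_side: thresholded Wnt3-HA signal found on a
        membrane side that does not touch a producing cell.
    """
    if n_producer_neighbours >= 4:
        # Engulfed by producers: omitted from the analysis.
        return "excluded"
    if signal_on_non_contact_side:
        return "positive"
    if n_producer_neighbours >= 1:
        # Touches a producer but shows no transferred signal.
        return "no transfer"
    # Negative cell without a producing neighbour: not scored.
    return "not counted"


def normalized_surface_level(wnt3_ha_intensity: float, epcam_intensity: float) -> float:
    """Whole-image Wnt3-HA intensity divided by Epcam intensity."""
    if epcam_intensity <= 0:
        raise ValueError("Epcam intensity must be positive")
    return wnt3_ha_intensity / epcam_intensity
```

Dividing by the Epcam signal, a general membrane marker, is what lets specimens of different size be compared on one scale.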

Western blotting, Wnt-conditioned media and reporter assay. HEK293T cells 
were transfected with pcDNA3 expression vector that encoded the ORF of the 
mouse Wnt3 complementary DNA (cDNA), which contained an HA epitope at 
the corresponding position (Q41) that was introduced by PCR mutagenesis with 


© 2016 Macmillan Publishers Limited. All rights reserved 


the primers described above. Whole cell/organoid lysates were collected in 1× 
denaturing SDS-PAGE buffer, before western blot analysis using anti-HA (as above, 
0.5 µg ml⁻¹ in PBS containing 5% milk powder). For generation of Flag-tagged 
(Extended Data Fig. 2c) or HA-tagged (Extended Data Fig. 5f) mouse Wnt3a, an 
internal epitope was inserted at the equivalent position (Q38). A construct with 
HA-tagged mouse Wnt3a was used for Extended Data Fig. 5f. For production of 
Wnt-conditioned media, murine L-cells were transfected with Wnt3a constructs 
using FuGENE (Promega). Stable, clonal cell lines were established after selection 
using Zeocin. Media were harvested after 3–4 days of conditioning and analysed 
by western blotting using mouse anti-Wnt3a (a gift from R. Takada) and mouse 
anti-Flag M2 (Sigma) antibodies. HEK293T cells were transiently transfected 
with TOPFlash or FOPFlash reporter constructs and TK-Renilla as a transfection 
control. Cells were stimulated overnight with either control L-cell medium or 
Wnt3a-conditioned medium. Luciferase activity was measured using the Dual 
Luciferase Reporter kit (Promega) according to the manufacturer's protocol. 
qPCR analysis. RNA preparation, cDNA preparation, qPCR and primer sequences 
have been described!®. 

EDU incorporation assay. EDU incorporation assay was performed using the 
Click-iT EdU Alexa Fluor 488 Flow Cytometry Assay Kit (Life Technologies). Cells 
were treated with 10 µM EDU 1 h before harvesting and single cell dispersal (as 
above) for fluorescence-activated cell sorting (FACS) analysis. Genomic DNA content 
was measured by addition of 0.5 µg ml⁻¹ 7-AAD (eBioscience) before analysis. 
Statistical analysis. Sample size was chosen empirically following previous 
experience in the assessment of experimental variability. No statistical methods 
were used to predetermine sample size. Samples were not randomized and the 
investigators were not blinded. After the normal distribution was confirmed using the 
Shapiro-Wilk test, significant differences between two groups were evaluated using 
two-tailed, unpaired Student's t-tests. No samples/specimens were excluded from 
the statistical analysis. Differences were considered to be significant when P < 0.05. 
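
As a worked illustration of the two-group comparison described above, the following Python sketch computes the unpaired Student's t statistic with pooled variance using only the standard library. The function name `unpaired_t` and the toy values are assumptions for this example; the Shapiro-Wilk check and the conversion of t to a two-tailed P value need a statistics package and are not reproduced here.

```python
import math
import statistics


def unpaired_t(a, b):
    """Return (t, degrees of freedom) for an unpaired, equal-variance t-test."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)  # sample variances
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df            # pooled variance
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))               # s.e. of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, df


# Toy example (made-up numbers): two groups of three measurements each.
t_value, dof = unpaired_t([1.0, 2.0, 3.0], [3.0, 4.0, 5.0])
```

The resulting t would then be compared against the t distribution with `dof` degrees of freedom to test against the P < 0.05 criterion.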
[Panel b: amino-acid alignment of the N-terminal regions of mouse and human Wnt3, Wnt5a, Wnt1, Wnt2b, Wnt3a and Wnt7a and of Xenopus Wnt8; signal peptide and HA epitope tag position indicated.] 

Extended Data Figure 1 | Generation of an epitope-tagged Wnt3 mouse 
allele. a, Reported three-dimensional protein structure of Xwnt8 (in 
turquoise) bound to Fzd8-CRD (in blue) (Protein Data Bank accession 
number 4F0A)”’. Dotted line depicts the location of the amino (N) 
terminus, where the HA epitope tag was introduced. Note that the position 
of the tag does not interfere with receptor binding. The image was created 
with the jmol-viewer*”. Glycosyl and palmitoyl groups are shown in green 
and yellow, respectively. b, Amino-acid alignment of the N-terminal 
region of Wnt3 and other Wnt-family members. Predicted signal peptides 
are indicated in red. The HA tag (YPYDVPDYASL) was inserted after 
position Q41 that is labelled in blue. The first residue of Xwnt8 that is 
resolved in the crystal structure in a (T32) is shaded in green. Small 
letters indicate non-conserved residues that could not be aligned by the 
software*®. The degree of local similarity is marked by asterisks. A score 
of five asterisks represents maximal similarity, which is not reached using 
the default settings of the algorithm. c, HA-knock-in strategy: a targeting 
vector that comprised the normal exon 2, a Frt-site flanked Pgk-neomycin 
resistance-polyA cassette and an inverted HA-inserted exon 2 (placed in 
intron 1 of the gene) was used. LoxP and LoxM sites were introduced in a 
configuration that allowed excision of the wild-type exon 2 and inversion 
of the HA-tagged exon 2 by Cre-mediated recombination (orientation of 
LoxP and LoxM sites is indicated with arrowheads). Transgenic mice were 
crossed to Pgk-Cre mice for germline recombination of the allele. This was 
done because the non-recombined allele was apparently a null allele as no 
homozygous offspring could be obtained. We suspect that the antisense 
configuration of the HA-modified exon 2 could result in RNA duplex 
formation and masking of the regular splice acceptor site of exon 2. 
Consistently, we observed a shorter transcript lacking wild-type exon 2 
from the non-recombined allele by RT-PCR analysis. d, Southern blot 
analysis of ES cell clones to confirm correct targeting. Genomic BamHI 
digest was performed (see scheme in c). 
mWNT3        1  MEPHLLGLLLGLLLSGTRVLAGYPIWWSLALGQQYTSLASQ-----------PLLCGSIPGLVPKQLRFCR  60 
mWNT3-HA     1  MEPHLLGLLLGLLLSGTRVLAGYPIWWSLALGQQYTSLASQYPYDVPDYASLPLLCGSIPGLVPKQLRFCR  71 
mWNT3-Flag   1  MEPHLLGLLLGLLLSGTRVLAGYPIWWSLALGQQYTSLASQDYKDDDDK---PLLCGSIPGLVPKQLRFCR  68 
mWNT3A       1  MAP--LGYLL-VLCSLKQALGSYPIWWSLAVGPQYSSLSTQ-----------PILCASIPGLVPKQLRFCR  57 
mWNT3A-Flag  1  MAP--LGYLL-VLCSLKQALGSYPIWWSLAVGPQYSSLSTQDYKDDDDK---PILCASIPGLVPKQLRFCR  65 
Extended Data Figure 2 | A permissive internal location for 
introduction of epitope tags in Wnt proteins. a, Amino-acid sequences 
of tagged Wnt versions used in this study. Signal peptides are labelled in 
red. Protein alignment (Clustal Omega program, Uniprot); asterisk, colon 
and dot symbols indicate full conservation, and groups of strongly and 
weakly similar properties, respectively. b, Lentiviral rescue experiment 
in Wnt-dependent (Wnt3Δ/Δ) small intestinal organoids. Top: average 
organoid number (±s.d. in n = 3 independent wells) in the absence of 
exogenous Wnt-conditioned medium (Wnt-CM). Introduction of an 
internal HA- or Flag-tag does not interfere with rescue activity of Wnt3 
lentivirus. Bottom: cell morphology after 5 days (passage 0) in the absence 
or presence of Wnt-CM. Scale bar, 50 µm. c, Introduction of an internal 
Flag-tag in mouse Wnt3a results in functional protein secretion in L-cells. 
Top: TOPFlash assay using conditioned media from control L-cells and 
stable lines expressing Wnt3a and Wnt3a-Flag. Bottom: immunoblots of 
the conditioned media measured above using anti-Wnt3a and anti-Flag 
antibodies. Fully scanned western blots are shown. 
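
The internal tagging shown in panel a can be checked with a short Python sketch that inserts an epitope after a 1-based residue position. The helper name `insert_tag` and the constant names are assumptions for this example; the sequences are the N-terminal mouse Wnt3 fragment and tag sequences as they appear in the alignment, not the full-length protein.

```python
HA_TAG = "YPYDVPDYASL"   # HA epitope as listed in Extended Data Fig. 1
FLAG_TAG = "DYKDDDDK"    # Flag epitope as it appears in the alignment above

# N-terminal 60 residues of mouse Wnt3, copied from the alignment (gaps removed).
MWNT3_NTERM = (
    "MEPHLLGLLLGLLLSGTRVLAGYPIWWSLALGQQYTSLASQ"  # residues 1-41, ending in Q41
    "PLLCGSIPGLVPKQLRFCR"                        # residues 42-60
)


def insert_tag(protein: str, position: int, tag: str) -> str:
    """Insert `tag` directly after the 1-based residue `position`."""
    if not 1 <= position <= len(protein):
        raise ValueError("position outside protein")
    return protein[:position] + tag + protein[position:]


wnt3_ha = insert_tag(MWNT3_NTERM, 41, HA_TAG)
wnt3_flag = insert_tag(MWNT3_NTERM, 41, FLAG_TAG)
```

With these inputs the tagged fragments are 71 and 68 residues long, matching the end coordinates of the mWNT3-HA and mWNT3-Flag rows.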
Extended Data Figure 3 | Wnt signalling status in Wnt3-HA knock-in 
crypts and organoids. Quantitative RT-PCR analysis of freshly isolated 
crypts (a) or established organoids (b, c) from the small intestine. 
a, b, Mean normalized expression levels in Wnt3HA/HA relative to 
Wnt3HA/+ samples (n = 6 mice per genotype). c, Relative expression 
to Wnt3+/+ organoids (n = 4 independent wells). Error bars, s.d.; no 
significant changes were found (P > 0.05 as determined by Student's t-test). 
Normal expression of stem-cell markers (Lgr5, Olfm4), Wnt pathway 
activity (Axin2) and Paneth cell markers (Lyz1, Defa6 and Wnt3) indicates 
that introduction of the HA-tag does not interfere with Wnt signalling. 
Extended Data Figure 4 | Immunodetection of endogenous Wnt3-HA 
protein expression. Western blot analysis using cell lysates from wild-type 
controls or Wnt3HA/HA organoids after normal culture or directed 
differentiation to Paneth cells (DAPT/CHIR-99021 treatment for 6 days 
(ref. 21)). a, In microscopic images, Paneth cell differentiation is evident 
by presence of a dark granular structure. b, Anti-HA western blot shows 
a specific signal corresponding to the Wnt3-HA protein (arrow; expected 
molecular mass of the mature protein chain is 39 kDa). As positive control, 
lysates from HEK293T cells transfected with an expression plasmid 
encoding the Wnt3-HA cDNA were used. Full-scan western blot is shown. 
c, Wnt3-HA staining (green signal) on small intestinal cryosections of 
wild-type control and Wnt3HA/HA mice. Counterstaining of secretory 
granules (wheat germ agglutinin; WGA) and cell membranes (anti-Epcam). 
Data from Fig. 1e are shown in bright-field and single confocal 
channels (×20 objective) with magnified crypt region (right). Note that 
the HA-signal in Paneth cells is mutually exclusive with WGA-positive 
apical granules. d, High-magnification confocal image of crypt (×63 
objective) merged with bright-field channel or nuclear staining (DAPI). 
Paneth cells were identified by their granular morphology (asterisks); 
arrows label crypt membranes adjacent to Paneth cells. Single confocal 
image is shown (for the entire z-stack see Supplementary Video 1). 
e, Whole-mount staining of wild-type control and Wnt3HA/HA small 
intestinal organoids. Confocal images or z-projections of co-stainings as in c 
(×20 objective). f, High-magnification confocal image (×63 objective) of a 
crypt region in Wnt3HA/HA organoid. Scale bars, 50 µm (a, c, e) and 10 µm (d, f). 
Extended Data Figure 5 | Diffusible Wnt activity in organoids is 
neither sufficient nor necessary to support growth. a, Culture 
after co-embedding of dsRED-labelled wild-type organoids with GFP- 
labelled wild-type or Wnt3Δ/Δ organoids in Matrigel. Organoid fragments 
were seeded in a 1:1 ratio. Wnt3Δ/Δ cells cannot be propagated alone or 
in the presence of wild-type cells. Images after seeding (P0; day 3) and 
after passage 1 (P1; day 10). b, Co-culture at higher seeding density using 
dsRED-labelled wild-type organoids or in vitro differentiated Paneth 
cells (ref. 21) with GFP-labelled Wnt3Δ/Δ organoids in a 3:1 ratio. Images of the 
same wells are shown at 1 and 5 days after seeding. c, d, Quantification of 
results shown in a and b. Mean relative number of organoids (±s.d.) from 
n = 3 independent experiments. e, Anti-HA immunodetection following 
co-culture of Wnt3HA/HA organoids or in vitro differentiated Paneth cells 
with dsRED-labelled wild-type recipient organoids (3:1 ratio). Inserts 
show growth in Matrigel. Confocal z-projected images are shown. Note 
that dsRED-positive cells remain negative for Wnt3-HA. f, Bead depletion 
experiment. Wild-type, Wnt3HA/HA or Wnt3Δ/Δ organoids were either 
embedded in Matrigel alone or together with anti-HA affinity beads (blue 
asterisks) to sequester diffusible HA-tagged Wnt. Bright-field images 
after 6 days of culture; note that Wnt3HA/HA organoids display unaffected 
morphology in the presence of beads. Wnt3Δ/Δ organoids do not grow in 
the presence of beads and L-cell derived Wnt3a-HA CM (black arrows) 
demonstrating efficient depletion. Scale bars, 500 µm (a, b) and 100 µm (e, f). 
Extended Data Figure 6 | Wnt3 protein localization and messenger 
RNA expression depend on the R-spondin (Rspo) concentration. 
a, Wnt3-HA immunostaining (Wnt3HA/HA organoids) after culture for 
6 days in 1%, 5% or 25% Rspo-conditioned medium. Counterstaining 
of plasma membranes (Epcam) and secretory granules (WGA). Scale 
bar, 25 µm. b, RT-PCR analysis of stem-cell and differentiation markers 
in small intestinal organoids following 6 days of culture in variable 
concentrations of Rspo. Shown are mean normalized expression levels 
(±s.d.) in n = 3 independent wells relative to organoids cultured in the 
normal concentration of 5% Rspo. Markers of stem cells (Lgr5, Olfm4), 
Wnt activity (Axin2), Paneth cells (Lyz1, Defa6, Wnt3), enterocytes (Alpi), 
endocrine cells (Chga) and goblet cells (Muc2). Note that the messenger 
RNA expression of Wnt3 is sensitive to reduced Rspo concentration. 
a APC 
clone#1 
WT AAAAGCGTTTTGAGTGCCTTATGGAACCTGTCTGCACACTGCACTGAGAATAAGGC 
seq 5 ATGCGTTTTGAGTGCCTTAT-GAACCTGTCTGCACACTGCACTGAGAATAAGG 
seq 6 GCGTTTTGAGTGCCTTAT-—GAACCTGTCTGCACACTGCACTGAGA 
seq 7 CGTTTTGAGTGCCTTAT-—GAACCTGTCTGCACACTGCACTGAGAATAAGG 
seq 8 CGTTTTGAGTGCCTTAT-—GAACCTGTCTGCACACTGCACTGAGAATAAGG 
seq 10 AAAAGCGTTTTGAGTGCCTTAT-—GAACCTGTCTGCACACTGCACTGAGAATA 
clone#2 
WT AAGCGTTTTGAGTGCCTTAT-—GGAACCTGTCTGCACACTGCACTGAGAATAA 
seq 11 TTTGAGTGCCTTATGGGACCCTGTCTGCACACTGCACTGAGAATAAG 
seq 12 TTTTGAGTGCCTTATGGGAACCTGTCTGCACACTGCACTGAGAATAAGG 
seq 13 CGTTTTGAGTGCCTTATGGGAACCTGTCTGCACACTGCACTGAGAATA 
seq 14 AAGCGTTTTGAGTGCCTTATGGGAACCTGTCTGCACACTGCACTGAGAATAAG 
seq 15 CGTTTTGAGTGCCTTATGGGAACTGTCTGCATACTGCACTGAGAATAA 
seq 16 GCGTTTTGAGTGCCTTATGGGAACCTGTCTGCACACTGCACTGAGAATAAG 
b Rnf43 
clone#1 
WT CCGGCTGTAGACGAGCCCGAGCC--—GAGTGGCCAGACTCGGGGAGTAGCTGCAGCT 
seq 9 CCCGAGCCACGAGTGGCCA 
seq 10 CCCGAGCC-CGAGTGGCCAGA 
seq 11 CCCGAGCCACGAGTGG 
seq 12 CCCGAGCC-CGAGTGGCCA 
seq 15 CCCGAGCC-CGAGTGGCCA 
seq 16 CCCGAGCCACGAGTGGCC 
clone#2 
WT CCGGCTGTAGACGAGCCCGAGCCGAGTGGC CAGACTCGGGGAGTAGCTGCAGCTC 
seq 17 CCCGAGC-GAGTGGCC 
seq 18 CCCGAGC-—GAGTGGCC 
seq 19 CCCGAGC-GAGTGGCC 
seq 21 CCCGAGCC-AGTGGCC 
seq 22 CCCGAGCC-AGTGGCC 
seq 23 CCCGAGC-GAGTGG 
c Znrf3 
clone#1 
WT ATCCATGGCTACTGCAGCACCACACCTGCCCCCACTGTCGGCACAACATCATAGGTA 
seq 25 CAGCACCACACCTGCC--CACTGTCGG 
seq 26 GCAGCACCACACCTGCC--CACTGTCGGCACAAATCATAGG 
seq 27 AGCACCACACCTGCC--CACTGTCGGCACACATCATAGGTA 
seq 28 CAGCACCACACCTGCC--CACTGTCGGCAAACTCTAGGTAA 
seq 29 GCAGCACCACACCTGCC--CACTGTCGGCAAAATCATAGGT 
seq 30 GCAGCACCACACCTGCC--CACTGTCGGCACAA 
clone#2 
WT ATCCATGGCTACTGCAGCACCACACCTGCCCCCACTGTCGGCACAACATCATAGGTA 
seq 33 GCAGCACCACACCTGCC--CACTGTCGGC 
seq 34 TAGCAGCACCACACCTGCC--CACTGTGGGC 
seq 35 CAGCACCACACCTGCC-—-CACTGTCGGCA 
seq 36 GCACCACACCTGCC--CACTGTCGGCACA 
seq 37 CAGCACCACACCTGCC-—-—CACTGTCGGC 
seq 38 CAGCACCACACCTG--—-—------— TCGGCACAACATCATAGGTA 
Extended Data Figure 7 | CRISPR/Cas9 induced mutations in 
organoids. a–c, Sequence analysis of indel mutations in targeted regions 
of mouse APC (a), Rnf43 (b) and Znrf3 (c) genomic loci. Two independent 
clonal organoid lines were analysed for each genotype. Genomic loci 
were PCR amplified, fragments were subcloned in E. coli and five or six 
colonies were sequenced per clonal line. For APC (clone#1 and #2) and 
Znrf3 (clone#1) mutant lines, hemizygosity was found, suggesting a larger 
genomic deletion on the other chromosome. sgRNA target sequences 
are shown in blue with the PAM sequence marked in bold. Arrows, Cas9 
cleavage sites. Inserted nucleotides are shown in red. 
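
Reading indel sizes off gapped rows like those in this figure can be sketched in Python as follows. The function name `indel_summary` is an assumption for this example, and the two rows are the overlapping stretch of the Znrf3 wild-type sequence and one clone#1 read as they appear in the extracted figure text, so the result illustrates the counting only.

```python
def indel_summary(wt_row: str, read_row: str) -> dict:
    """Count deleted and inserted bases between two equally long gapped rows.

    A '-' in the read marks a deleted base; a '-' in the wild-type row
    marks a base inserted in the read.
    """
    if len(wt_row) != len(read_row):
        raise ValueError("rows must be aligned to the same length")
    deleted = sum(1 for w, r in zip(wt_row, read_row) if r == "-" and w != "-")
    inserted = sum(1 for w, r in zip(wt_row, read_row) if w == "-" and r != "-")
    return {"deleted": deleted, "inserted": inserted, "net": inserted - deleted}


wt_segment = "CAGCACCACACCTGCCCCCACTGTCGG"    # Znrf3 wild-type, overlapping stretch
read_segment = "CAGCACCACACCTGCC--CACTGTCGG"  # clone#1 read with a 2-bp gap
summary = indel_summary(wt_segment, read_segment)
```

For this pair the summary reports two deleted bases and no insertions.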
Extended Data Figure 8 | Differential efficiency of Wnt3 transfer 
to APC- and Rnf43/Znrf3-deficient cells. a, Re-associated epithelia 
to test Wnt3 decoration on receiving cells (dsRED positive). The 
z-projected confocal images show depleted HA-signal on APC mutant 
cells (arrowheads) and enriched staining on Rnf43/Znrf3 mutant cells 
(asterisks). Wnt3 range: receiving cells were colour-coded depending on 
the success of transfer and the distance to the closest Wnt3-HA neighbour. 
b, Average percentages of each distance fraction in n = 10 re-associated 
organoids (±s.d.). Data were compared by t-test to re-association 
experiment using wild-type receiving cells (shown in Fig. 2f). ***P< 107%; 
**P < 10-7; NS, non-significant. Scale bars, 10 µm. 
Extended Data Figure 9 | Pharmacological inhibition of proliferation in 
organoids. Cell cycle status of wild-type mouse small intestinal organoids 
was determined by flow cytometry. EDU incorporation assay in controls 
and 24 h after administration of EGFR-inhibitor (Gefitinib), MEK-inhibitor 
(PD-0325901) or CDK4/6-inhibitor (Palbociclib) to the regular culture 
medium. EDU was added 1 h before collection and dissociation of cells 
for FACS analysis. a, Original FACS data. Single cells were gated using 
FSC/SSC characteristics. EDU signals were plotted against 7-AAD signals 
(DNA content). b, Inhibitor titration. Relative percentage of EDU-positive 
cells is shown for two independent experiments (grey line shows average). 
Experiments in Fig. 4 were performed at concentrations indicated by 
arrows. Note that Lgr5+ intestinal stem cells double every 19–20 h 
in vitro*', comparable to the cell cycle length in vivo*? (21.5 h). 
The sexual identity of adult intestinal stem cells 
controls organ size and plasticity 


Bruno Hudry1, Sanjay Khadayate1 & Irene Miguel-Aliaga1 


Sex differences in physiology and disease susceptibility are 
commonly attributed to developmental and/or hormonal factors, 
but there is increasing realization that cell-intrinsic mechanisms 
play important and persistent roles”. Here we use the Drosophila 
melanogaster intestine to investigate the nature and importance 
of cellular sex in an adult somatic organ in vivo. We find that the 
adult intestinal epithelium is a cellular mosaic of different sex 
differentiation pathways, and displays extensive sex differences 
in expression of genes with roles in growth and metabolism. Cell-specific 
reversals of the sexual identity of adult intestinal stem cells 
uncover the key role this identity has in controlling organ size, 
reproductive plasticity and response to genetically induced tumours. 
Unlike previous examples of sexually dimorphic somatic stem cell 
activity, the sex differences in intestinal stem cell behaviour arise 
from intrinsic mechanisms that control cell cycle duration and 
involve a new doublesex- and fruitless-independent branch of the 
sex differentiation pathway downstream of transformer. Together, 
our findings indicate that the plasticity of an adult somatic organ 
is reversibly controlled by its sexual identity, imparted by a new 
mechanism that may be active in more tissues than previously 
recognized. 

Sex differences in intestinal physiology’ prompted us to investigate 
possible molecular underpinnings. RNA-seq transcriptional profiling 
of virgin adult midguts (GEO accession number GSE74775) indicated 
significant sexual dimorphism in gene expression and/or splicing, with 
8.3% of all genes and 5.6% of all isoforms expressed in the midgut 
displaying differences in expression between the sexes (Extended Data 
Fig. 1a–f and Supplementary Table 1). Sex-biased expression or splicing 
was confirmed for a subset of genes by real-time qRT-PCR (Extended 
Data Fig. 1h, i). Genes with sex differences in expression cluster into 
distinct biological processes; genes assigned to cell division-related 
processes are more abundantly expressed in females, whereas genes 
coding for proteins that function in carbohydrate metabolism and 
redox processes are preferentially expressed in males (Extended Data 
Fig. 1g). 

The above results suggested active expression of sex determinants 
in the adult intestine. In Drosophila, sex chromosome sensing leads to 
sexually dimorphic expression of Sex lethal (Sxl): the master regulator 
of both sexual development and dosage compensation (DC, an epige- 
netic process by which transcription of the single male X chromosome 
is upregulated twofold*). Sxl controls the sex-specific splicing of its 
downstream target gene transformer (tra)°, leading to functional Tra 
protein expression only in females, and to the sex-specific splicing of 
two Tra direct targets—the transcription factors doublesex (dsx) and 
fruitless (fru)—that sculpt sexually dimorphic anatomical features, 
reproductive systems and behaviour* * (Extended Data Fig. 2b). As may 
be expected from their roles in DC‘, Sxl and the sexually dimorphic DC 
complex (DCC) member Male-specific lethal 2 (Msl-2) are ubiquitously 
expressed in the intestinal epithelium of females and males, respec- 
tively, in both adult intestinal progenitors (intestinal stem cells (ISCs) 
and postmitotic enteroblasts (EBs)) and their differentiated progeny 


(enterocytes (ECs) and enteroendocrine cells (EECs)) (Extended Data 
Fig. 2a-d). Consistent with our RNA-seq analysis, a newly generated 
tra-LacZ reporter is also expressed in all four epithelial cell types of 
the adult intestine (Extended Data Fig. 2a, e), suggesting that down- 
stream of Sxl, the sex differentiation pathway remains active in this 
adult epithelium. However, the two targets of Tra-mediated sex-specific 
splicing are either absent from all midgut epithelial cells (Fru) or are 
expressed in only a subset of epithelial cells (DsxF/M, expressed in ECs 
but absent from ISCs, EBs and EECs) (Extended Data Fig. 2a, f, g). 
Hence, the expression of sex determination genes is maintained in 
the adult midgut, but displays cell-type specificity; while adult-born 
enterocytes express all members of the canonical sex determination 
pathway, their siblings (the EECs) and both types of adult intestinal 
progenitors (ISCs and EBs) express the early (Sxl, tra), but not the late 
(DsxF/M, FruM), effectors of the pathway. 

Both the enrichment analysis and the presence of Sxl/tra in adult 
ISCs pointed to sexually dimorphic ISC proliferation. Female flies 
exhibit a rapid proliferative response to dextran sodium sulphate 
(DSS)-induced damage of the intestinal epithelium (Fig. 1a and ref. 9). 
This response was less pronounced in male midguts (Fig. 1a) or in 
female (but not male) midguts following adult-restricted Sxl downregulation 
in intestinal progenitors (Fig. 1a and Extended Data 
Fig. 3a, c). Conversely, ectopic expression of Sxl in adult intestinal 
progenitors enhanced proliferation in male (but not female) midguts 
(Fig. 1a and Extended Data Fig. 3a). Additional cell-type- and 
adult-specific Sxl downregulation experiments indicated that Sxl acts 
in ISCs, and not in other cells, to control sexually dimorphic proliferation 
rather than differentiation (Fig. 1e and Extended Data Figs 3d 
and 4a). Mechanistically, females do not have a significantly higher 
density of ISCs than male flies (Fig. 1b) or a higher ratio of symmetric 
versus asymmetric divisions (Fig. 1c), suggesting that the proliferative 
capacity of female ISCs is intrinsically enhanced by their expression 
of Sxl. Consistent with this idea, a higher percentage of their adult 
progenitors are found in G2/S phase at the expense of G1 in homeostatic 
conditions (Fig. 1d), suggestive of shorter cell cycles, and 
adult-specific downregulation of Sxl in intestinal progenitors abrogated 
the sexual dimorphism in G2/S to G1 ratio, without affecting 
the number of ISCs or their division mode (Fig. 1b–d). Clonal 
analyses further confirmed the intrinsic nature of the sexual dimorphism 
in proliferation, its Sxl control and adult reversibility both 
during regeneration and homeostasis (Fig. 1g and Extended Data 
Fig. 3b, e and f). 

To investigate whether the reported Sxl effects result from deregulated 
DC, we first confirmed that DC can be functionally inactivated 
in adults by observing loss of histone H4 lysine 16 acetylation of the 
X chromosome upon adult-specific downregulation of msl-2 in male 
intestinal progenitors (Extended Data Fig. 3g). We then investigated 
whether ectopic msl-2 expression accounted for the reduced proliferation 
resulting from Sxl downregulation by co-downregulating both 
genes in adult intestinal progenitors. This did not reinstate female 
proliferation (Fig. 1f). The converse experiment—misexpression of 
msl-2 in adult female intestinal progenitors using a newly generated 
transgene coding for HA-tagged Msl-2—did not reduce their proliferation 
(Fig. 1f) despite efficient Msl-2 protein expression and function 
(Extended Data Fig. 3h and data not shown). Hence, DC does not 
account for the ISC sex differences. 

This focused our attention on the sex differentiation pathway and its 
main effector tra. Like Sxl, tra downregulation reduced DSS-induced 
proliferation in females to levels comparable to those seen in male 
midguts, but did not affect proliferation in male midguts (Fig. 2a and 
Extended Data Fig. 5a). Conversely, tra misexpression, either ubiqui- 
tous (Extended Data Fig. 5c) or confined to adult intestinal progenitors 
(Fig. 2a and Extended Data Fig. 5a), increased the proliferative response 
of ISCs to DSS in adult males, but not in females. Clonal and tra 
mutant rescue experiments further confirmed the adult, cell-intrinsic 
Figure 1 | Sxl controls intrinsic sex differences in adult ISC proliferation 
independently of dosage compensation. a, Number of mitoses (phospho-histone 
H3 (pH3)-positive cells, top graph) and percentage of cells positive 
for an intestinal progenitor marker (escargot (esg)-positive/total area, 
bottom graph) in controls and flies with adult-restricted downregulation 
or misexpression of Sxl in ISCs/EBs (achieved by esg-Gal4, tub-Gal80TS-driven 
Sxl RNAi or UAS-Sxl, respectively). Flies were exposed to control 
(sucrose, S) or damage-inducing (DSS) diets. Representative images 
are shown to the right (DNA: DAPI, in blue; ISC/EB marker: GFP, in 
green). b, Stem cell number (esg+, Suppressor of Hairless (Su(H))− cells) 
in the posterior midgut following 20 days of adult ISC/EB-specific Sxl 
downregulation. c, Quantifications of symmetric (red) versus asymmetric 
(green) ISC divisions based on cortical Partner of Numb (Pon)-GFP 
distribution in metaphase and telophase reveal no differences between 
the sexes or upon adult ISC/EB-specific Sxl downregulation. Mitoses 
with lack of clear Pon-GFP signal are displayed in yellow. d, Percentage of 
progenitors in G1, S or G2 as revealed by ISC/EB-driven expression of the 
cell cycle indicator Fly-FUCCI. e, Number of mitoses in DSS-treated flies 
with adult-specific downregulation of Sxl in ISCs (esg-Gal4, Su(H)-Gal80 
driver), EBs (Su(H)-Gal4), ECs (midgut expression 1 (mex1)-Gal4) or EECs 
(prospero (pros)"'-Gal4). f, pH3 quantifications following adult-specific 
and simultaneous msl-2/Sxl downregulation or msl-2 misexpression in 
ISCs/EBs. g, MARCM clone size quantifications (graph) and representative 
images (labelled in green with GFP) reveal that clones expressing SxlRNAi 
are smaller than control clones in females, but not in males. n denotes the 
number of midguts (a, e, f), ISCs/EBs (b, d), mitoses (c) or clones (g) that 
were analysed for each genotype. Results were combined from at least 
two independent experiments. See Supplementary Information for full 
genotypes. In this and all subsequent figures, error bars correspond to 
standard error of the mean (s.e.m.). 


requirement for tra in regulating sexually dimorphic proliferation, both 
during regeneration (Extended Data Fig. 5b, d) and in normal home- 
ostasis (Fig. 2c and Extended Data Fig. 5e, f). Strikingly, reintroduc- 
tion of a tra transgene specifically in adult intestinal progenitors fully 
rescued the reduced proliferation resulting from Sxl downregulation 
(Fig. 2b). Together with the Sxl experiments, these results show that the 
sex of the midgut is actively specified in adult flies. Unlike other adult 
somatic cell types®!%"!, adult ISCs have a plastic sexual identity man- 
ifested by an intrinsic, Sxl/tra-controlled and DC-independent sexual 
dimorphism in both their basal and regenerative proliferation. These 
findings extend previous work in both mouse and Drosophila!*-'® by 
showing that sexual identity not only needs to be actively maintained, 
but can also be reversed bidirectionally in the adult cells of a non- 
gonadal organ. 

Unexpectedly (and in contrast to tra), mutation of the canonical 
Tra binding partner Transformer 2 (Tra2)*’ failed to affect the sex 
differences in ISC proliferation (Fig. 2d), despite interfering with the 
sex-specific expression of dsxF and Yolk protein 1 (Yp1) transcripts as 
anticipated (Extended Data Fig. 5g). Together with our finding that 
ISCs do not express Dsx or FruM (Extended Data Fig. 2a, f, g), this 
result suggested that a non-canonical Tra2-, Dsx-, FruM-independent 
sex determination pathway drives sexually dimorphic proliferation in 
adult intestinal stem cells. We conducted a series of rescue, gain- and 
loss-of-function experiments using dsxF transgenes, dsx/fru mutants, 
RNAi transgenes and mutants in which dsx/fru splicing was shifted 
towards either male or female isoforms (Extended Data Fig. 6a-g). 
These ruled out both a direct action of dsx or fru in intestinal progen- 
itors as well as indirect contributions to ISC proliferation from other 
dsx/fruM-expressing cells such as the neighbouring ECs, thus confirm- 
ing that the sex of adult somatic stem cells in the intestine is specified 
by a novel branch of the sex determination pathway. 

We then combined new RNA-seq and genetic experiments to identify 
tra target genes in adult ISCs, defined as genes of which the expres- 
sion or splicing was different between control and tra mutant females, 
but returned to levels comparable to those of controls upon adult 
ISC-specific re-introduction of tra (see Methods, GEO accession 
number GSE74775 and Supplementary Table 1 for details). This 
yielded 72 genes with tra-regulated expression (34) or splicing (38) in 
Figure 2 | tra, but not tra2, controls intrinsic sex differences in 
adult ISC proliferation. a, Graphs of mitoses (pH3, top) and intestinal 
progenitor area (esg>GFP, bottom) in flies exposed to control (sucrose, S) 
or damage-inducing (DSS) diets, in both controls and flies with 
adult ISC/EB-restricted tra downregulation or misexpression. 
Representative images are shown below the graphs (DNA: DAPI, in blue; 
ISC/EB marker: GFP, in green). b, Comparable quantifications of the Sxl 
downregulation phenotypes in females, and their rescue by re-expression 
of tra’. c, Clone size quantifications (in arbitrary units (a.u.) of GFP 
fluorescence, see Methods) reveal that tra null mutant MARCM clones are 
smaller than control clones only in females. d, No significant differences in 
ISC proliferation in the midguts of DSS-treated males or females lacking 
tra2 (tra2®/P§2Virx) or lacking tra2 specifically in adults (achieved by 
shifting flies with the thermosensitive allele tra2‘! from 18°C to 29°C 
in the adult stage) versus controls. n denotes the number of midguts 
(a, e, f), ISCs/EBs (b, d), mitoses (c) or clones (g) that were analysed for 
each genotype. Results were combined from at least two independent 
experiments. See Supplementary Information for full genotypes. 


adult intestinal progenitors (Fig. 3a, b and Extended Data Fig. 7a–c). 
A genetic screen (Extended Data Fig. 7d, see Methods for details) iden- 
tified three of them as negative or positive regulators of proliferation 
accounting for the sexual dimorphism in ISC proliferation. Indeed, 
downregulation of Imaginal disc growth factor 1 (Idgf1) or Serpin 88Eb 

Figure 3 | tra targets in adult intestinal progenitors. a, b, Heat maps of 
the genes with tra-regulated expression (a) or splicing (b) in ISCs/EBs, 
displaying their normalized abundance in females, tra null mutant females 
and tra null mutant females with feminized ISCs/EBs (esgts>traF) (see 
Methods for details). c, Mitoses (pH3) quantifications following adult 
ISC/EB-confined manipulations of rdo, Idgf1 and Spn88Eb expression. 
Representative images for each genotype are shown below the graphs 
(DNA: DAPI, in blue; ISC/EB marker: GFP, in green). n = 10 midguts per 
genotype/condition. Results were combined from at least two independent 
experiments. See Supplementary Table 1 for a full list of names and quality 
scores, and Supplementary Information for full genotypes. 


(Spn88Eb), normally activated by tra in adult intestinal progenitors, 
reduced proliferation only in females. By contrast, overexpression of 
reduced ocelli (rdo), normally repressed by tra in the same cells, reduced 
proliferation only in females (Fig. 3c and Extended Data Fig. 7e). These 
three targets are regulated at the expression—rather than splicing— 
level, consistent with novel, tra2- and, possibly, splicing-independent 
roles of tra. Once produced, they may be secreted or targeted to the cell 
surface'”"°, suggesting that sex differences may result from modulation 
of how ISCs interact with their local environment. All three proteins 
also appear to be expressed in other tissues and/or during develop- 
ment, and belong to protein families also represented in mammals: 
leucine-rich repeat (rdo), inhibitory serpins (Spn88Eb) and secreted 
glycoproteins (Idgf1)!”!*°. It will therefore be interesting to explore 
whether they account for how organisms attain their sexually dimor- 
phic body size during development. 

Are the sex differences in ISCs physiologically significant? Wild-type 
female midguts are longer than male midguts at days 3, 5 and 20 of adult 
life (Fig. 4a and data not shown). Adult-specific and cell-autonomous 
masculinization of ISCs (achieved by tra or Sxl downregulation) dramatically 
shrank the female (but not the male) midgut towards a size 
more similar to that of males (Fig. 4a and Extended Data Fig. 8c). 
Cell type- and region-specific quantifications indicated that a reduc- 
tion in ISC proliferation and, consequently, the number of differenti- 
ated progeny accounted for the observed shrinkage (Extended Data 

Figure 4 | Physiological importance of the sex differences in intestinal 
progenitors. a, Midgut length quantifications and representative images of 
phenotypes resulting from adult ISC/EB-specific masculinization of ISCs 
(achieved by esgts-driven tra downregulation initiated after the phase of 
midgut post-eclosion growth, see Methods for details). b, The number of 
mitoses (pH3-positive cells) is higher in control female flies 3 days after 
mating. The postmating increase is abrogated upon adult ISC/EB-specific 
tra downregulation. An EB marker (Su(H)-LacZ, in red in image panels) 
reveals that the EB expansion seen in females after mating is reduced 
upon adult ISC/EB-specific tra downregulation. See also Extended Data 
Fig. 8 for quantifications. c, pH3 quantifications inside MARCM clones 
of control flies, Su(H) mutants and Su(H) mutants in which Sxl has been 
downregulated inside the clone. Su(H) mutation only leads to increased 
pH3 counts in females, and this increase is Sxl dependent. d, Hyperplasia 
(quantified by the number of pH3-positive cells) resulting from adult 
ISC/EB-driven Notch downregulation and its modulation by tra in female and 
male midguts. Confocal images show intestinal progenitor coverage of 
representative midgut portions for each genotype (DNA: DAPI, in blue; 
ISC/EB marker: GFP, in green). n denotes the number of midguts (a, b, c, d) 
that were analysed for each genotype. Virgin flies were used in all experiments 
unless otherwise indicated. Results were combined from at least two 
independent experiments. See Supplementary Information for full genotypes. 


Fig. 8a-d). Female flies with masculinized adult ISCs also failed to 
undergo the recently described midgut resizing triggered by mating”! 
(Fig. 4b and Extended Data Fig. 8e) and had slightly reduced fecundity 
(Extended Data Fig. 8f), indicating that sex differences endow females 
with enhanced stem cell plasticity to optimise reproduction. Females 
are, however, more prone to genetically induced tumours; indeed, 
adult-specific interference with Notch (N) or Apc-ras signalling, previ- 
ously shown to lead to tissue overgrowth reminiscent of gastrointestinal 
tumours in females”””>, resulted in hyperplasia of female (but not male) 
midguts (Fig. 4c, d and Extended Data Fig. 9a-f). Hyperplasia resulted 
from sex differences in proliferation rather than sex-specific differ- 
entiation defects (Extended Data Fig. 4b), and could be prevented by 
simultaneously masculinizing female adult ISCs by downregulating or 
mutating Sxl/tra, but not dsx and fruM (Fig. 4c, d and Extended Data 
Fig. 9a, c-g). Increased susceptibility of female flies to genetically 
induced tumours was also observed after mating (data not shown). 
Thus, the intrinsic sexual identity of adult intestinal stem cells plays key 
roles in adult life, both in maintaining organ size and in modulating 
its plasticity. 

Previous observations had pointed to additional branches of the 
sex determination pathway***°. Our discovery of this new branch 
has important implications for organs such as the nervous system, 
where sexual identity was thought to be confined to FruM- and/or 
Dsx-expressing neurons”, and raises the possibility that every cell has 
a sexual identity that actively regulates its plasticity and physiology. 
Although early sex specification in Drosophila differs from that of 
mammals, there is increasing evidence that it converges on common 
effectors such as the Dsx/Dmrt family of transcription factors and their 
targets'*. Hence, the recently reported sex differences in both intestinal 
gene expression and microbiota in the mammalian intestine?””’ could 
at least partly result from similar, so far unexplored, intrinsic sex 
differences. Similarly, the possible contribution of the intrinsic sexual 
identity of adult somatic stem cells to both organ plasticity and to the 
well documented sex differences in susceptibility to many types of 
cancer’? deserves further investigation. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Fly husbandry. Fly stocks were reared on a standard cornmeal/agar diet (6.65% 
cornmeal, 7.15% dextrose, 5% yeast, 0.66% agar supplemented with 2.2% nipagin 
and 3.4 ml l−1 propionic acid). All experimental flies were kept in incubators at 
25°C, 65% humidity and on a 12 h light/dark cycle, except for those containing 
tub-Gal80ts transgenes, which were set up at 18°C (restrictive temperature) and 
transferred to 29°C (permissive temperature) at the time when Gal4 induction 
was required. This depended on the specific experimental requirements but typ- 
ically, for loss-of-function (RNAi) or gain-of-function (UAS) experiments, flies 
were raised and aged as adults for 3 days at 18°C, were then shifted to 29°C to 
induce transgene expression, and adult midguts were dissected after 10-20 days 
(as indicated in each figure panel). Flies were transferred to fresh vials every 3 days 
and fly density was kept to a maximum of 15 flies per vial. Virgin flies were used 
for all experiments unless indicated otherwise. 

For mutant ISC clonal analyses (MARCM clones), 3-day-old adults (raised and 
aged at 25°C) were heat-shocked for 1 h at 37°C to induce clones, and were then 
kept at 25°C (or 29°C for MARCM RNAi clones) until dissection (10 days, 15 days 
or 4 weeks thereafter as indicated in each figure panel). Flies were transferred to 
fresh vials every 3 days. 

For damage-induced regeneration assays, virgin flies were collected over 72 h at 
18°C, and were then shifted to 29°C for 7 days on standard media. Flies were then 
transferred to an empty vial containing a 3.75 cm × 2.5 cm piece of paper. 500 µl 
of 5% sucrose solution (control) or 5% sucrose + 3% dextran sulphate sodium (DSS) 
solution was used to wet the paper, which served as the feeding substrate. Flies were 
transferred to a new vial with fresh feeding paper every day for 3 days prior to dissection. 

For fecundity experiments, females were raised and aged as virgins for 3 
days at 18°C, and were then shifted to 29°C to induce transgene expression 
for 10 days. Females were then mated overnight to OregonR males (10 males, 
10 females per vial). Males were then removed, and single female flies were 
transferred to individual vials every 24 h for a 3-day period. Eggs were counted 
from the vacated vials. 

Fly stocks. UAS transgenes. UAS-SxlRNAi (VDRC: GD 3131), UAS-Sxl (generated 
in ref. 31), UAS-pon.GFP (BDSC: 42741, generated in ref. 32), UAS-GFP.E2f1.1-230, 
UAS-mRFP1.NLS.CycB.1-266 (BDSC: 55121, generated in ref. 33), UAS-GFP 
(BDSC: 35786), UAS-Dicer2 (VDRC: 60007), UAS-traRNAi (BDSC: 28512, 
TRiP.JF03132), UAS-traF (BDSC: 4590, generated in ref. 34), UAS-msl-2RNAi 
(VDRC: GD 29356), UAS-SxlRNAi 2 (VDRC: KK 109221), UAS-msl-2RNAi 2 
(BDSC: 35390, TRiP.GL00309), UAS-msl-2HA (this study), UAS-dsxRNAi (BDSC: 
26716, TRiP.JF02256), UAS-dsxF (BDSC: 44223, generated in ref. 35), UAS-Arr1 
(generated in ref. 36), UAS-rdo (generated in ref. 17), UAS-GstE4 (generated in 
ref. 37), UAS-NotchRNAi (VDRC: KK 100002), UAS-SxlRNAi 3 (BDSC: 34393, 
TRiP.HMS00609), UAS-traRNAi 2 (VDRC: GD 2560), UAS-traRNAi 3 (NIG: 16724-R2), 
UAS-fruRNAi (BDSC: 31593, TRiP.JF01182), UAS-NotchRNAi 2 (VDRC: GD 27229), 
UAS-Hairless (generated in ref. 38), UAS-Nintra (generated in ref. 39), UAS-sspitz 
(gift from J. Treisman, generated in ref. 40), UAS-RasV12; FRT82B line (generated 
in ref. 41). 

Mutants. tra!, FRT2A, fru”!-Gal4 chromosome (generated in ref. 42), FRT82b, dsx! 
chromosome (generated in ref. 43), dsx, Df(3R)"*"""!”? chromosome (generated 
in ref. 44), Df(3R)dsx'! (BDSC: 1865, generated in ref. 45), In(3R)dsx”> (BDSC: 
1849, generated in ref. 46), FRT82b, dsx', fruwe chromosome (gift from B. Baker, 
generated in ref. 47), dsx!, fru? 1-LexA chromosome (gift from B. Prudhomme, 
generated in ref. 48), tra®° (this study), Df(3L)*/” (BDSC:5416, generated in 
ref. 49), tra’ (BDSC: 675, generated in ref. 50), tra2® (BDSC: 25137, generated in 
ref. 51), Df(2R)'"* (BDSC:1896, generated in ref. 52), tra2'*! (BDSC: 2413, gener- 
ated in ref. 52), dsx? (BDSC: 840, generated in ref. 53), fru” and fru™ (gift from 
G. Jefferis, generated in ref. 47), frut”? (gift from G. Jefferis, generated in ref. 54), 
FRT82b, DI***F!° chromosome (gift from S. Bray, generated in ref. 55), Su(H)"”, 
FRT40A chromosome (gift from S. Bray, generated in ref. 56), FRT82B Apc2™! wok, 
Apc® chromosome and hs-flp; UAS-Ras""”; FRT82B Apc2™!%, Apc line (both 
generated in ref. 41). 

Reporters and Gal4 drivers. Su(H)-LacZ (generated in ref. 57), esg-GFPP01786 
(gift from L. Jones, generated in ref. 88), esg-LacZ*°°6 (BDSC: 10359, generated 
in ref. 89), dsx-Gal4 (generated in ref. 58), fru? 1_Gal4 (gift from G. Jefferis, gen- 
erated in ref. 59), esg-Gal4N?”°7, UAS-GFP, Tub-Gal807* chromosome (gift from 
J. de Navascués), Su(H)“""-Gal80 (generated in ref. 60), Su(H, )CBE_Gal4 (generated 
in ref. 61), mex1-Gal4 (generated in ref. 62), pros’!-Gal4 (generated in ref. 63), 
MvI-Gal4®??375(Kyoto: 104178), dsx“?-Gal4 (generated in ref. 64), vm-Gal4 
(BDSC: 48547, GMR13B09-GAL4), btl-Gal4 (generated in ref. 65), nSyb-Gal4 
(BDSC: 51941, generated in ref. 90), elav-Gal4 (BDSC: 8765, generated in ref. 91), 
nSyb-Gal80 (gift from J. Simpson), stripe-Gal4 (BDSC: 26663, generated in ref. 92). 
MARCM stocks. FRT40A: w, hs-flp, Tub-Gal4, UAS-GFP; tub-Gal80, FRT40A (gift 
from J. de Navascués), FRT40A (gift from J. de Navascués, generated in ref. 93), 
FRT2A: y, w, hs-flp; Tub-Gal4, UAS-GFP; Tub-Gal80, FRT2A (gift from I. Salecker), 
FRT2A (BDSC: 1997, generated in ref. 94), FRT82b: w, hs-flp; UAS-mCD8GFP, 
Tub-Gal4; FRT82b, Tub-Gal80 (gift from M. Vidal), FRT82b (BDSC:7369, gen- 
erated in ref. 93). 

Generation of traKO and tra-LacZ transgenic lines. To generate the transgenic 
reporter of tra promoter activity, tra-LacZ, the upstream intergenic 
region between spd-2 and tra was cloned using the following primer pair: 
5′-AAAATCTAGAAACTAATAAAGTATATGAG-3′ and 
5′-AAAAGGTACCCGGAAAATGCTGGAAATTAAATGATGC-3′. PCR was performed with Q5 
high-fidelity polymerase from New England Biolabs (M0491S). The PCR product 
was digested with XbaI and KpnI, and was then cloned into the pH-Pelican attB vector 
(ref. 66, gift from C. Thummel). The construct was sequence-verified and a 
transgenic line was established through ΦC31 integrase-mediated transformation 
(Bestgene, attP site: attP40). 

The new amorphic allele of tra, traKO, was generated using the accelerated 
homologous recombination method recently developed by Baena-Lopez (ref. 67). The 
5′ (4,302 nt) and 3′ (2,928 nt) homology arms were produced from a tra BAC 
(BAC-PAC resources: CH322-140P01) with the following primer pairs respectively: 
5′-AAAAGCGGCCGCCATTCTACTCTTGAATTGGCTAGC-3′/5′-AAAAGTACCATGATGCACTTTCCTCAGTGTGA-3′ 
and 5′-AAAAGGCGCGCCAAGAGAATACCATGG-3′/5′-AAAAGGCGCGCCATTGTCGACACAATCAAACTG-3′. 
PCRs were performed with Q5 high-fidelity polymerase from New England Biolabs 
(M0491S). The PCR products were digested with NotI/KpnI or AscI respectively, 
and were then cloned into the pTVCherry vector to generate pTVCherry[tra]. The 
vector was sequence-verified and inserted into random genomic locations by 
P-element-mediated transformation (Bestgene). Transformants (not necessarily 
mapped or homozygosed) were crossed to hs-FLP, hs-SceI flies (BDSC: 25679) 
and the resulting larvae were heat-shocked at 48, 72, 96 and 120 h after egg laying 
for 1 h at 37°C. Approximately 200 adult females with mottled eyes (indicating 
the presence of pTVCherry[tra] and the transgenes carrying hs-FLP and hs-SceI) 
were crossed in pools of 15 to ubiquitin-Gal4[3xP3-GFP] males and the progeny 
was screened for the presence of red-eyed flies. The ubiquitin-Gal4[3xP3-GFP] 
transgene was subsequently removed by selecting against the presence of GFP in 
the ocelli. The generated deletion removed 342 nucleotides starting 13 nucleotides 
upstream of the transcription start site. 

Generation of the UAS-msl-2HA line. To overexpress msl-2, a transgenic UAS line 
was generated from msl-2 cDNA (BDGP Gold cDNAs, clone ID: GH22488), adding an 
HA tag at the C terminus, with the following primer pair: 
5′-AAAAAGATCTATGGCTCAGACGGCATACTTG-3′ and 
5′-AAAATCTAGATTAAGCGTAATCTGGAACATCGTATGGGTACAAGTCATCCGAGCCCGAC-3′. PCR was 
performed with Q5 high-fidelity polymerase from New England Biolabs (M0491S). 
The PCR product was digested with BglII and XbaI before cloning into the 
pUASTattB vector (ref. 68). The construct was sequence-verified and a transgenic line 
was established through ΦC31 integrase-mediated transformation (Bestgene, attP 
site ZH-86Fb, BDSC: 24749). 

Immunohistochemistry. Intact guts were fixed at room temperature for 20 min in 
PBS, 3.7% formaldehyde. All subsequent incubations were done in PBS, 4% horse 
serum, 0.2% Triton X-100 at 4°C following standard protocols. 

The following primary antibodies were used: mouse anti-Sxl (M114, DSHB 
Hybridoma) 1:50, mouse anti-Sxl (M18, DSHB Hybridoma) 1:50, goat anti-Msl-2 
(dC-20, sc-32459, Santa Cruz Biotechnology) 1:50, chicken anti-beta galactosidase 
(ab9361, Abcam) 1:200, rabbit anti-phospho-histone H3 Ser10 (9701L, Cell 
Signalling Technology) 1:500, rabbit anti-FruM (Male-2, generated in ref. 69) 
1:500, mouse anti-DsxDBD (DSHB Hybridoma) 1:100, mouse anti-Pros (MR1A, 
DSHB Hybridoma) 1:50, goat anti-ac-Histone H4 Lys16 (sc-8662, Santa Cruz 
Biotechnology) 1:500, rat anti-HA (11867423001, Roche) 1:500, mouse anti-GFP 
(11814460001, Roche) 1:1,000, mouse anti-Pdm1 (kind gift of Steve Cohen, gen- 
erated in ref. 70) 1:20. Fluorescent secondary antibodies (FITC-, Cy3- and Cy5- 
conjugated) were obtained from Jackson Immunoresearch. Vectashield with DAPI 
(Vector Labs) was used to stain DNA. 
Cell, clone and midgut length quantifications. Mitotic indices were quantified 
by counting phospho-Histone H3-positive cells in >10 midguts per genotype, 
time point, and/or condition (for example, male/female), and are displayed as 
means ± standard error of the mean (s.e.m.). For posterior midgut cell counts, a 
midgut portion immediately anterior to the junction with the tubules and hindgut 
was imaged at 20× magnification. Cells were counted using ImageJ in areas of 
identical size across all genotypes to control for size differences. Threshold was 
adjusted for the GFP channel (ImageJ function: Image > Adjust > Threshold) to 
subtract background, and the percentage of area above the threshold was 
considered (ImageJ function: Analyze Particles). Data were collected from at least 
10 midguts per genotype and/or condition, and are displayed as mean of percentage 
of GFP-positive area ± s.e.m. 
MARCM clones were quantified as number of cells per clone (when the GFP 
reporter was expressed in the nucleus) or by the size of the GFP area/clone (in 
arbitrary units, when the reporter was membrane-tagged GFP). 
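
The summary statistics described above (mean ± s.e.m. of per-midgut counts, and the percentage of image area above a GFP threshold) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used ImageJ; the function names and the nested-list image representation are assumptions made for illustration only.

```python
import math
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, s.e.m.), with s.e.m. = sample standard deviation / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to estimate the s.e.m.")
    return mean(values), stdev(values) / math.sqrt(n)


def percent_area_above(pixels, threshold):
    """Percentage of pixels whose intensity is strictly above the threshold.

    `pixels` is a hypothetical image given as a list of rows of intensities.
    """
    flat = [p for row in pixels for p in row]
    return 100.0 * sum(p > threshold for p in flat) / len(flat)
```

For example, `mean_sem` would be applied to the pH3-positive cell counts of all midguts of one genotype, and `percent_area_above` to the GFP channel of each imaged midgut portion before averaging.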

To measure midgut length, guts were dissected and were then straightened on 
polylysine-coated slides. After imaging, a line was drawn between the most anterior 
point of the proventriculus and the midgut-hindgut junction using ImageJ. 
The number of pixels contained in the line was used as a proxy for midgut length. 
For display purposes, the value obtained for a control female midgut selected at 
random was used as reference to normalize all the other values, which are shown 
as percentage of control female length value. 
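
The normalization described above amounts to a simple ratio, sketched below in Python. The helper names are hypothetical, and the sketch assumes a straight line between two landmark coordinates (in pixels), with one control female value chosen as the reference.

```python
import math


def line_length(p1, p2):
    """Euclidean length, in pixels, of the line between two (x, y) landmarks."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])


def percent_of_reference(lengths, reference):
    """Express each midgut length as a percentage of the reference length."""
    return [100.0 * x / reference for x in lengths]
```

Because all values are divided by the same reference, the choice of reference midgut rescales the axis but does not change relative differences between genotypes.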

Functional screen of tra targets. To test the functional significance of tra 
targets, only those with female/tra null mutant female fold differences in transcript 
abundance > ±2 that were also under tra control in adult intestinal progenitors 
were selected for functional validation. The contribution of these tra targets to 
sex differences in adult ISC proliferation was investigated using publicly 
available RNAi and UAS lines (see below for details of the RNAi lines, and Methods 
section ‘Fly stocks’ for details of the UAS lines) expressed from esgts-Gal4. Results 
obtained with 11 out of the 16 lines belonging to the VDRC KK collection were 
not considered and are not shown in Extended Data Fig. 7 because their 
expression in adult ISCs from esgts-Gal4 resulted in a recurrent ISC differentiation 
phenotype in both males and females. We attribute this effect to the previously 
reported dominant Gal4-dependent toxicity issue with this VDRC KK 
collection (ref. 71). The genes from the GD library were Idgf1 (transformant ID 12416); 
Spn88Eb (transformant ID 28425); CG8008 (transformant ID 4159); CG4500 
(transformant ID 34852); Arr1 (transformant ID 22196); Spn47C (transformant 
ID 25534); Cyp12d1-p (transformant ID 49269); TwdlE (transformant ID 24867); 
Cyp12d1-d (transformant ID 50507); GstE2 (transformant ID 32945); GstE4 
(transformant ID 20472); lectin-28C (transformant ID 45634); AANATL2 (trans- 
formant ID 44677); CG5348 (transformant ID 1698); CG15128 (transformant ID 
19302) and Ir10a (transformant ID 45403). The genes from the KK library were 
CG17470 (transformant ID 100414); Klp54D (transformant ID 100140); CG4500 
(transformant ID 106260); Arr1 (transformant ID 109860); rdo (transformant ID 
107213); Spn47C (transformant ID 105933); Cyp12d1-p (transformant ID 109256); 
TwdlE (transformant ID 107483); Cyp12d1-d (transformant ID 109248); GstE4 
(transformant ID 100986); lectin-28C (transformant ID 104290); AANATL2 
(transformant ID 102802); CG5348 (transformant ID 106565); CG15128 (transformant 
ID 100238); CG15236 (transformant ID 101045) and Ir10a (transformant ID 
100181). 
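
The selection rule used to shortlist candidates for the screen (a fold difference beyond ±2 between females and tra null mutant females, combined with evidence of tra control in adult progenitors) can be written as a short filter. The sketch below is illustrative only: the record fields (`female`, `tra_mutant`, `rescued`) are hypothetical names, and the signed fold-change convention (negative values for genes more abundant in the mutant) is an assumption about how the ±2 threshold was applied.

```python
def signed_fold(female, tra_mutant):
    """Signed fold difference: +x if higher in females, -x if higher in mutants.

    Assumes both abundances are strictly positive.
    """
    ratio = female / tra_mutant
    return ratio if ratio >= 1 else -1.0 / ratio


def select_candidates(genes, min_fold=2.0):
    """Keep genes whose |fold| exceeds min_fold and that are flagged as
    restored by progenitor-specific tra re-expression (hypothetical flag)."""
    return [
        g["name"]
        for g in genes
        if abs(signed_fold(g["female"], g["tra_mutant"])) > min_fold and g["rescued"]
    ]
```

Both tra-activated genes (positive fold) and tra-repressed genes (negative fold) pass the filter, which matches the screen's inclusion of positive and negative regulators.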

Statistics and data presentation. All statistical analyses were carried out in the 
R environment (ref. 72). Comparisons between two genotypes/conditions were analysed 
with the Mann-Whitney-Wilcoxon rank sum test (R function wilcox.test). 
All graphs were generated using Adobe Illustrator. All confocal and bright field 
images belonging to the same experiment and displayed together in our figure 
were acquired using the exact same settings. For visualization purposes, level and 
channel adjustments were applied using ImageJ to the confocal images shown 
in the figure panels (the same correction was applied to all images belonging to 
the same experiment), but all quantitative analyses were carried out on unad- 
justed raw images or maximum projections. In all figures, n denotes the number 
of midguts, ISCs/EBs, mitoses or clones that were analysed for each genotype. 
Values are presented as average ± standard error of the mean (s.e.m.), with P values from the 
Mann-Whitney-Wilcoxon test (non-significant (NS): P > 0.05; *0.05 > P > 0.01; 
**0.01 > P > 0.001; ***P < 0.001). Lines and asterisks highlighting significant 
comparisons across sexes are displayed in red, whereas those highlighting significant 
comparisons within same-sex data sets are displayed in black. 
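
As an illustration of the rank-sum comparison and the asterisk convention, the following Python sketch computes the pairwise-comparison form of the rank-sum statistic and maps a P value to a significance label. It is not the authors' R code; the function names are hypothetical, the statistic is offered as what should correspond to the W value reported by R's wilcox.test, and the handling of P values falling exactly on a boundary is an assumption, as is reading three asterisks as the label for the smallest P values.

```python
def rank_sum_statistic(x, y):
    """Number of (x_i, y_j) pairs with x_i > y_j, counting ties as one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def significance_label(p):
    """Map a P value to NS, *, ** or *** using the 0.05 / 0.01 / 0.001 cut-offs."""
    if p > 0.05:
        return "NS"
    if p > 0.01:
        return "*"
    if p > 0.001:
        return "**"
    return "***"
```

The statistic ranges from 0 (every value in `x` below every value in `y`) to `len(x) * len(y)` (complete separation in the other direction); the P value itself would come from the exact or normal-approximation null distribution, which is not reimplemented here.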

The number of Drosophila animals was not limiting, so no statistical methods 
were used to predetermine sample size. The experiments were not randomized. 
RNA-seq experiments. RNA extraction. RNA from 30 pooled dissected midguts 
was extracted using TRIzol (Invitrogen), and 3 such samples were used for each 
sex and genotype. RNA-seq libraries were prepared from 500 ng of total RNA 
using the Illumina TruSeq mRNA stranded library prep kit (Illumina, San Diego, 
USA) according to the manufacturer’s protocol. Library quality was checked 
on a Bioanalyser HS DNA chip and concentrations were estimated by Qubit 
measurement. Libraries were pooled in equimolar quantities and sequenced 
on a HiSeq 2500 using paired-end 100 bp reads. At least 30 million reads 
passing filter were achieved per sample. After demultiplexing, raw RNA-seq reads 
were aligned with the TopHat splice junction mapper (ref. 73), version 2.0.11, against 
the Ensembl Drosophila genome reference sequence assembly (dm3) and transcript 
annotations. 

Differential gene expression analysis. For differential gene expression analysis, 
gene-based read counts were then obtained using the HTSeq count module (version 
0.5.3p9) (ref. 74). Differential expression analysis was performed on the count data using 
the DESeq2 Bioconductor package (ref. 75). The analysis was run with the default 
parameters. The DESeq2 package uses a negative binomial model to model read counts and 
then performs statistical tests for differential expression of genes. Raw P values 
were then adjusted for multiple testing with the Benjamini-Hochberg procedure. 
GO enrichment analysis was performed using FlyMine v40.1.b. In Fig. 3a, b and 
Extended Data Fig. 1g, each column corresponds to one of three different 
replicates (30 midguts each) for each sex, and transcript abundance for each gene was 
normalized to a scale of −2 (white) to +2 (grey). 
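
For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, a minimal pure-Python sketch follows. It is a generic implementation of the standard step-up procedure, not the code path used inside DESeq2, and the function name is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Each raw P value is multiplied by m / rank (rank 1 = smallest), and
    monotonicity is enforced by taking a running minimum from the largest
    P value downwards.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be called differentially expressed when its adjusted value falls below the chosen cut-off (P < 0.05 in the analyses described here).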

Isoform expression analysis. Reads were first aligned to the transcript sets using 
Bowtie software (version 1.0.0) (ref. 76). Transcript annotations for cDNA and non-coding 
RNAs were obtained from Ensembl (release 75). Isoform abundances were then 
calculated using the mmseq R package (version 1.0.8) (ref. 77). Differential expression 
analysis was performed on these abundances using the DESeq2 Bioconductor package 
(version 1.42.1) (ref. 78). Isoforms not expressed in any of the samples were not included 
in the differential expression analysis. Raw P values were adjusted for multiple test- 
ing with the Benjamini-Hochberg procedure. Initial isoform analysis identified dif- 
ferential expression for 1,379 isoforms between male and female midguts (P < 0.05 
cutoff), which, following subtraction of those corresponding to single-isoform 
genes, yielded 704 isoforms (see main text for subsequent analysis). 
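
The subtraction step described above (discarding significant isoforms that belong to single-isoform genes, since these cannot reflect alternative splicing) can be sketched as follows. The function name and the isoform-to-gene dictionary are hypothetical; in practice the mapping would be derived from the Ensembl annotation.

```python
from collections import Counter


def multi_isoform_hits(significant, annotation):
    """Keep significant isoforms whose gene has more than one annotated isoform.

    `significant` is a list of isoform IDs; `annotation` maps every annotated
    isoform ID to its gene ID (hypothetical data structure).
    """
    isoforms_per_gene = Counter(annotation.values())
    return [iso for iso in significant if isoforms_per_gene[annotation[iso]] > 1]
```

Counting isoforms over the whole annotation, rather than over the significant set only, keeps a sex-biased isoform even when its sibling isoforms are not themselves significant.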

RNA-seq data displays. For the volcano plots in Extended Data Fig. 1, log2 fold 
change values were plotted against log10 of adjusted P values. Genes and isoforms 
significantly upregulated in males are coloured blue, while those significantly 
upregulated in females are coloured red. Selected genes and isoforms are further 
highlighted as empty circles. For the scatter charts with quadrants in Extended 
Data Fig. 7, log2 fold change values (between control and tra mutant females) 
were plotted against log2 fold change values (between tra mutant females and tra 
mutant females with adult-specific, progenitor-specific tra expression). 
Genes and isoforms significantly repressed by tra are coloured blue, whereas those 
significantly activated by tra are coloured red. Selected genes and isoforms are 
further highlighted as empty circles. Volcano plots and scatter charts were generated 
using Adobe Illustrator. To generate the heat maps, a matrix containing the relative 
values of gene/isoform expression was built. Transcript abundance for each 
gene/isoform was normalized to a scale ranging from −2 to +2. A hierarchical clustering 
algorithm (with Euclidean distance and average linking) was applied to the matrix 
using the MeV software suite (ref. 79). Area-proportional Venn diagrams were generated 
using BioVenn (http://www.cmbi.ru.nl/cdd/biovenn/). 
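
The coordinates and colours of a volcano plot point follow directly from the fold change and the adjusted P value. The sketch below is illustrative only (the plots in this study were drawn in Adobe Illustrator): the function name is hypothetical, the y axis is taken as the negative log10 of the adjusted P value, and the 0.05 cut-off and red/blue/black colour scheme follow the figure descriptions.

```python
import math


def volcano_point(female, male, padj, cutoff=0.05):
    """Return (x, y, colour) for one gene or isoform.

    x is log2(female/male); y is -log10(adjusted P). Assumes strictly
    positive abundances and padj > 0.
    """
    x = math.log2(female / male)
    y = -math.log10(padj)
    if padj < cutoff and x > 0:
        colour = "red"    # significantly female-enriched
    elif padj < cutoff and x < 0:
        colour = "blue"   # significantly male-enriched
    else:
        colour = "black"  # not significant
    return x, y, colour
```

The same function can be reused for the quadrant scatter charts by computing two log2 ratios per gene (control versus mutant, and mutant versus rescued mutant) and plotting one against the other.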

Real time qRT-PCRs. RNAs were extracted from 30 dissected midguts using 
TRIzol (Invitrogen). RNAs were cleaned using the RNeasy Mini Kit (QIAGEN), and 
cDNAs were synthesized using the iScript cDNA synthesis kit (Bio-Rad) from 
300 ng of total RNA. Quantitative PCRs were performed by mixing cDNA samples 
(5 ng) with iTaq Universal SYBR Green Supermix (Bio-Rad, #172-5124) and the 
relevant primers in a 96-well plate. Expression values were normalized to eIF4G. 
For each gene/isoform, at least 3 independent biological replicates were used, and 
2 technical replicates were performed. See Supplementary Information for a list 
of all primers used. 

Extended Data Figure 1 | Sexually dimorphic transcription and splicing 
in the adult midgut. a, Number and percentage of genes with sexually 
dimorphic gene expression, as revealed by RNA-seq transcriptional 
profiling of virgin male and female dissected midguts (P < 0.05 cutoff). 
b, Volcano plot displaying all genes with detectable midgut expression. 
Female/male ratio of gene expression is shown on the x axis (in log2 
scale) and significance is displayed on the y axis as the negative logarithm 
(log10 scale) of the adjusted P value. Genes with significantly upregulated 
(P < 0.05 cutoff) expression in males and females are coloured in blue and 
red, respectively. Other genes are displayed in black. Genes with known 
sex-specific transcription are displayed as red (female-enriched) or blue 
(male-enriched) open circles. c, d, Comparable analyses for sex-biased 
isoforms belonging to genes with multiple transcripts. We identified 714 
sex-biased isoforms belonging to a total of 603 genes. Isoforms resulting 
from known sex-specific alternative splicing are displayed as in panel b. 
e, Female/male ratios of overall transcript abundance (left graph) and 
abundance of sex-biased isoforms (right graph) for the members of the 
Drosophila sex determination pathway as revealed by RNA-seq analysis of 
the adult midgut. We note a sexual dimorphism in dsx transcript levels. 
f, Venn diagram illustrating the overlap between the genes showing 
sex-biased expression (overall transcript abundance, light grey, 1,305 genes) 
and sex-biased alternative splicing (sex-biased isoforms, dark grey, 
603 genes) in the adult midgut. Known members of the sex determination 
pathway are displayed as examples. g, Heat maps displaying genes with 
sexually dimorphic expression clustered by enrichment in specific 
biological processes, as revealed by Gene Ontology enrichment analysis. 
Genes with sexually dimorphic expression belonging to the top 4 enriched 
biological processes are shown. h, i, Real-time qRT-PCR data for a subset 
of genes for which RNA-seq transcriptional profiling experiments revealed 
sexually dimorphic expression (h) or isoforms (i). RNA was obtained from 
midguts from virgin male and female samples (same genotypes as for the 
RNA-seq experiments). For each gene/isoform, expression abundance was 
arbitrarily set at 100% for the sex with the highest expression level, and 
percentage of that expression is displayed for the other sex. See Methods 
for details and Supplementary Table 1 for a full list of names and quality 
scores, and Supplementary Information for full genotypes. 

Extended Data Figure 2 | Cell type-specific expression of sex 
determinants in the adult intestinal epithelium of virgin flies. a, In the 
adult Drosophila midgut, resident stem cells (ISCs) and their postmitotic 
daughter cells (EBs) maintain the adult intestinal epithelium during 
normal homeostasis and regenerate it after injury by giving rise to two 
types of differentiated progeny: ECs and EECs*”*!. The posterior midgut 
area used to visualize and quantify phenotypes is boxed. The following 
Gal4 drivers were used to label and/or genetically manipulate these four 
different cell types present in the adult intestinal epithelium: mex1 for 
ECs, prosV1 for EECs, esg for both ISCs and EBs, and Suppressor of Hairless 
(Su(H)) for EBs alone. b, The canonical sex determination pathway in 
somatic cells of Drosophila melanogaster, consisting of a cascade of 
sex-specific alternative splicing events culminating in the production of 
sex-specific transcription factors encoded by dsx and fru. In females, 
Sex lethal (Sxl) is activated and regulates the splicing of transformer (tra) 
pre-mRNA, resulting in the production of TraF. TraF regulates the female-
specific splicing of dsx pre-mRNA (dsxF) and of the fru transcript coming from 
the P1 promoter (fruP1, giving rise to fruF). In males, Sxl is not expressed 
and no functional Tra is produced, resulting in default splicing of dsx and 
fru pre-mRNAs, leading to DsxM and FruM proteins, respectively. The 
resulting male- and female-specific Dsx and Fru isoforms confer sexual 
identity to the cells in which they are produced. In addition, in females, Sxl 
represses dosage compensation by inhibiting Msl-2 expression. The tables 
summarize the cell-specific expression profiles of the sex determinants in 
adult midguts of virgin males and females, shown in the panels below. 

c, Sxl protein (in red) is expressed only in female midguts. Co-staining
with ISC/EB reporters indicates that Sxl is found in esg-positive progenitors
(ISCs: GFP-positive and LacZ-negative cells, and EBs: GFP-positive
and LacZ-positive cells). It is also expressed in female polyploid ECs
(GFP- (in green) and LacZ- (in blue) negative cells). Co-staining with
prosV1 reporter indicates that it is also expressed in EECs. d, Msl-2 protein
is found in the same cell types only in males (staining is confined to the
X chromosome, consistent with the signal observed in non-intestinal
tissues). e, A new reporter of tra promoter activity (tra-LacZ, see
Methods for details) is broadly expressed in the epithelium of both male
and female midguts, including ISCs and EBs (as revealed by co-staining
with esg-Gal4-driven GFP) and ECs (GFP-negative cells with large nuclei).
Co-staining with prosV1 reporter indicates that it is also expressed in EECs.
f, A dsx-Gal4 reporter (visualized with a GFP reporter that has been
false-coloured in red for consistency with the other panels) is active in
male and female polyploid ECs (LacZ-negative cells), and is inactive in
esg-positive progenitors (LacZ-positive cells, in green). Dsx protein
(visualized using a DsxDBD-specific antibody in red) is expressed strongly
in males but not females, indicating that the sexual dimorphism in dsx
transcript levels found in our RNA-seq analyses (Extended Data Fig. 1e) is
further enhanced at the protein level. Co-staining of the same antibody with
the EC marker mex1-Gal4 confirms expression in ECs. Cytoplasmic
dsx-Gal4-driven expression of an mTomato reporter is apparent in ECs
(visualized as large cells by co-staining with the membrane-enriched marker
Armadillo), but is absent from EECs (as revealed by the gaps in mTomato
expression in cells that are labelled with Pros). g, The fruP1-Gal4 reporter
(which labels the only sexually dimorphic fru transcript that gives rise to
Fru protein) is inactive in both male and female midguts, as revealed by lack
of GFP signal (false-coloured in red for consistency with other panels).
Consistent with the lack of fruP1-Gal4 expression, a FruM-specific antibody
(in red) detects no signal in the male midgut. An independent dsx reporter
(dsx“?-Gal4) is expressed in polyploid ECs, consistent with the data
displayed in f. See Supplementary Information for full genotypes.
Extended Data Figure 3 | Sxl controls intrinsic sex differences in
adult ISC proliferation independently of dosage compensation.
a, Immunocytochemistry using a Sxl-specific antibody indicates that
adult-restricted downregulation of Sxl in intestinal progenitors (ISCs/EBs),
achieved by Gal80ts-controlled expression of a Sxl RNAi transgene,
efficiently downregulates Sxl expression in progenitors, but not large
polyploid ECs. Conversely, efficient ectopic Sxl protein expression is
obtained by expression of a UAS-Sxl transgene in adult ISCs/EBs of male
flies. In all panels Sxl antibody is in red; DNA: DAPI, in blue; ISC/EB
marker: GFP, in green. b, Quantifications of the number of cells inside
control MARCM clones, or MARCM clones expressing RNAi transgenes
directed against Sxl following DSS treatment. 5 days after clone induction
by heat shock, female clones are larger than male clones only when Sxl is
present, confirming the cell autonomy of Sxl action. c, Additional controls
for Fig. 1a, and confirmation of phenotypes using an independent RNAi
transgene. This second RNAi transgene against Sxl (different from the one
used in Fig. 1a) reduces the number of pH3-positive cells in DSS-treated
female midguts when expressed from esgts in adult ISCs/EBs, confirming
an adult progenitor-specific requirement for Sxl in promoting
damage-induced cell divisions in female flies. d, Adult-specific
downregulation of Sxl in adult visceral muscle (using the vm driver),
trachea (btl-Gal4 and DSRF-Gal4), neurons (nSyb-Gal4, Elav-Gal4), or fat
body (stripe-Gal4) does not reduce DSS-induced ISC proliferation in females.
By contrast, Sxl downregulation using an ISC/EB driver with suppressed
neuronal expression (esg-Gal4 combined with nSyb-Gal80) effectively reduces
DSS-induced ISC proliferation in females. Together, these results indicate
that Sxl does not control sexually dimorphic DSS-induced ISC proliferation
from non-ISC cells. e, MARCM clones expressing a third RNAi transgene
against Sxl (distinct from those used in Fig. 1 and above) are smaller
than control clones in females, whereas their size is comparable to that of
wild-type or Sxl-RNAi clones in males. This confirms that, during normal
homeostasis, female ISCs divide more often than male ISCs because
of the cell-autonomous action of Sxl. The graph shows quantifications
of the number of cells within each clone 15 days after clone induction
by heat shock, and the confocal images show representative clones
(labelled in green with GFP) for each genotype. f, Clonal analyses of
homeostatic proliferation using the inducible esg flip-out system (which
labels progenitors and their progeny generated within a defined temporal
window) in midguts of control males, control females and females in
which Sxl downregulation has been confined to adult progenitors. 15 days
after induction, the size (assessed as the percentage of GFP-positive area)
of control female clones was significantly larger than that of male clones,
but both became comparable upon adult-specific Sxl downregulation
using Sxl RNAi transgenes. The graph shows area quantifications for
each sex/genotype, and the confocal images show representative clones
for each genotype. g, Immunohistochemical detection of histone H4
lysine 16 (H4Lys16) acetylation (in red) indicates that adult-specific
downregulation of msl-2 in male intestinal progenitors (ISCs/EBs marker:
GFP, in green) results in loss of H4Lys16 acetylation of the X chromosome.
h, Efficient Msl-2 misexpression in adult female intestinal progenitors
(ISCs/EBs marker: GFP, in green) is confirmed by immunocytochemistry
using an HA-specific antibody (in red). n denotes the number of guts
(c, d, f) or clones (b, e) that were analysed for each genotype. Results were
combined from at least two independent experiments. See Supplementary
Information for full genotypes.
Extended Data Figure 4 | Sexually dimorphic proliferation does
not result from sex differences in differentiation. a, Markers for
all four intestinal cell types are still apparent following adult-specific
downregulation of Sxl in the intestinal progenitors of females, achieved
by Gal80ts-controlled expression of a Sxl RNAi transgene. Indeed,
expression of esg-Gal4 (ISC/EBs), Su(H)-LacZ (EBs), Pdm1 (ECs) and
Pros (EECs) can be readily detected, suggesting that Sxl downregulation in
females (which results in reduced ISC proliferation) does not have a major
effect on differentiation. Sxl staining confirms efficient downregulation
in ISCs/EBs, but not neighbouring cells. b, The same markers reveal
that the differentiation defect resulting from Notch downregulation,
previously reported in females, is also apparent in males (note loss of
Su(H)-LacZ following Notch downregulation in both males and females),
suggesting that sex differences in differentiation do not contribute to the
sex differences in susceptibility to Notch-induced tumours. Co-expression
of a mitogen (secreted Spitz, sSpi) abrogates the sex differences in tumour
susceptibility by efficiently triggering hyperplasia also in males, as
revealed by an expanded progenitor (GFP-positive) area in both males and
females. The identity of these tumours in males is also comparable to that
previously shown for Notch tumours in females (consisting of high
Pros-positive EEC-like cells and low Pros-positive neoplastic ISC-like cells).
This further suggests that the sex differences in Notch-induced tumour
susceptibility do not arise from sexually dimorphic differentiation effects,
but result from sex differences in ISC proliferation. See Supplementary
Information for full genotypes.


Extended Data Figure 5 | tra, but not tra2, controls intrinsic sex
differences in adult ISC proliferation. a, Additional controls for Fig. 2a,
and confirmation of phenotypes using independent RNAi transgenes.
Two additional RNAi transgenes against tra reduce the number of
pH3-positive cells in DSS-treated female, but not male, midguts when
expressed from esgts in adult ISCs/EBs, confirming an adult
progenitor-specific requirement for tra in promoting damage-induced cell
divisions in female flies. b, tra1 MARCM mutant clones are smaller than
control clones in females, whereas their size is comparable to that of
wild-type or tra1 clones in males. This confirms that female ISCs divide more
often than male ISCs because of the cell-autonomous action of tra. The graph
shows quantifications of clone size (in arbitrary units of GFP fluorescence,
as described in Methods) 15 days after clone induction by heat shock, and
the confocal images show representative clones (labelled in green with
GFP) for each genotype. c, Ubiquitous, adult-restricted traF expression
from tub-Gal80ts in males increases the number of pH3-positive cells
following DSS treatment to levels comparable to those of female flies.
d, Re-introduction of this traF transgene specifically in adult ISCs/EBs
rescues the reduced, male-like intestinal proliferation (as assessed by the
number of pH3-positive cells) of tra null mutant females entirely lacking
the tra gene from all their tissues (traKO/tra1) to levels comparable to those
of control females. Expression of this transgene in control heterozygous
female flies (traKO/+ esgts>traF) does not significantly increase their
proliferation (in fact, it reduces it slightly relative to traKO/+ esgts>
controls, probably as a consequence of its overexpression). e, Clonal
analyses of homeostatic proliferation using the inducible esg flip-out system
(which labels progenitors and their progeny generated within a defined
temporal window) in midguts of control females and females in which
tra downregulation has been confined to adult progenitors. 15 days after
induction, the size (assessed as the percentage of GFP-positive area) of
control clones is significantly larger than that of tra-RNAi clones. The
graph shows area quantifications for each genotype, and the confocal
images show representative clones for each genotype. f, Consistent with
the tra mutant clonal analysis in Fig. 2c, quantifications of clone size
(number of cells per clone) reveal that MARCM clones in which tra has
been downregulated are smaller than control clones only in females.
Their size is comparable to that of wild-type or tra-downregulated
mutant clones in males. The confocal images show representative clones
(labelled in green with GFP) for each genotype in females. g, qRT-PCR
quantifications of relative abundance of traF, dsxF, dsxM and Yp1
transcripts in adult-specific tra2 mutants (tra2B/*! grown at permissive
temperatures, then switched to the restrictive temperature 4 days after
eclosion and transcriptionally profiled following 10 additional days at the
restrictive temperature) and controls (tra2B/+). In tra2 mutant females,
dsxF is lost, dsxM is upregulated to levels comparable to those of control
males and Yp1 (a DsxF target) is lost (to levels also comparable to those
of males). tra mutants (Df(3L)st-j7/traKO) were also used as a positive
control. n denotes the number of guts (a, c, d, e) or clones (b, f) that were
analysed for each genotype. Results were combined from at least two
independent experiments. See Supplementary Information for full genotypes.


© 2016 Macmillan Publishers Limited. All rights reserved 




[Extended Data Figure 6 graphic not reproduced. Panel titles: a, b, d, e, f, g, Regeneration; c, Homeostasis.]

Extended Data Figure 6 | dsx- and fruM-independent control of
sexually dimorphic proliferation in adult intestinal stem cells.
a, Adult-restricted downregulation of dsx (achieved by co-expression of a
dsx-RNAi transgene and Dicer-2 (Dcr-2) in ISCs/EBs) has no effect on the
compensatory ISC proliferation observed upon DSS treatment in either
males or females. dsxF expression in the same conditions does not increase
ISC proliferation in either males or females. b, dsxF expression does not
rescue the reduced proliferation resulting from tra downregulation in
males. Representative images for each genotype are shown in both a and
b (DNA: DAPI, in blue; ISC/EB marker: GFP, in green). c, The size of dsx
null mutant (dsx1) MARCM clones (quantified in arbitrary units of GFP
fluorescence as described in Methods) is comparable to that of controls in
both sexes 15 days after clone induction by heat shock. Confocal images
show representative clones (labelled in green with GFP) for each genotype.
d, Quantifications of clone size in control and dsx null mutant (dsx1)
MARCM clones in the midguts of DSS-treated males and females. 5 days
after clone induction by heat shock, there are no significant differences in
clone size (quantified in arbitrary units of GFP fluorescence as described
in Methods) between control and mutant clones in either males or
females. Confocal images show representative clones (labelled in green
with GFP) for each genotype. e, An RNAi transgene against fru does not
reduce the number of pH3-positive cells in DSS-treated midguts when
expressed from esgts in the adult ISCs/EBs of either males or females.
Confocal images show that the number of intestinal progenitors
(esg-positive cells in green) is also unaffected by this manipulation.
f, Quantifications
of the number of pH3-positive cells upon DSS treatment indicate that
the sexual dimorphism in ISC proliferation is unaffected in females
with forced fruM expression (fruM/fru*”) or in males with forced fruF
expression (fruF/fru*“). g, ISC proliferation is unaffected in the midguts
of DSS-treated males and females entirely lacking dsx (dsxΔ/dsx1), or
producing only DsxF (dsxΔ/dsx11) or DsxM (dsxΔ/dsxD). ISC proliferation
is also unaffected in dsx, fru double null mutant males and females
(dsxΔ, Df(3R)Exel6179/dsx1, fruP1.LexA) and in dsx null mutants in which fruM
is ectopically produced in females (dsxΔ, Df(3R)Exel6179/dsx1, fruΔtra).

n denotes the number of guts (a, b, e, f, g) or clones (c, d) that were 
analysed for each genotype. Results were combined from at least two 
independent experiments. See Supplementary Information for full 
genotypes. 
Extended Data Figure 7 | tra targets in adult ISCs. a, Scatter plot of all
1,346 genes with tra-dependent expression in the adult fly midgut. For
each gene, control female/tra null mutant female (Df(3L)st-j7/traKO) fold
differences in transcript abundance (x axis, log2 scale) are plotted against
tra mutant female with feminized ISCs (adult-restricted rescue of traF in
ISCs/EBs)/tra mutant female fold differences (y axis, log2 scale). Genes
with tra-sensitive expression and significantly repressed by traF in ISC/EBs
(P < 0.05 cutoff) are therefore found in the left-bottom quadrant and are
displayed in blue, whereas those significantly activated by traF are found in
the top-right quadrant and are displayed in red. Genes with tra-dependent
transcription, independent of the action of traF in intestinal precursors, are
displayed in black. b, Comparable analysis of tra-dependent alternative
splicing. c, tra expression in adult ISCs affects splicing of 38 transcripts
by at least 5 different mechanisms. The outcome of each of the alternative
splicing mechanisms is shown in yellow for a representative gene.
d, Adult-restricted downregulation (RNAi lines) or misexpression (UAS
lines) of tra targets in adult ISCs/EBs by means of esg-Gal4, tub-Gal80ts.
Genes normally repressed in female progenitors in a traF-dependent
manner were downregulated in males (top row) and/or misexpressed
in females (bottom row). Genes upregulated in female progenitors in a
traF-dependent manner were downregulated in females (bottom row).
Adult-restricted downregulation of Idgf1 and Spn88Eb reduces the number
of mitoses (pH3-positive cells) in DSS-treated females. Conversely, rdo
misexpression inhibits DSS-induced ISC proliferation in females.
Adult-restricted downregulation of other traF targets in the same conditions
does not affect ISC proliferation in either males or females. e, Male controls
for Fig. 3c. In contrast to their effects on females, adult ISC/EB-restricted
misexpression of rdo or downregulation of Idgf1 and Spn88Eb does not
reduce the percentage of midgut area covered by esg-positive cells in
DSS-treated males (DNA: DAPI, in blue; ISC/EB marker: GFP, in green).
n denotes the number of guts (d) that were analysed for each genotype.
Results were combined from at least two independent experiments.
See Supplementary Information for full genotypes.
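The colour coding described for panel a can be sketched as a simple rule on the two fold differences. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function and argument names are hypothetical, both fold differences are assumed to be on a log scale (so that zero means no change), and a single P value per gene is assumed to test the effect of traF in ISCs/EBs.

```python
def quadrant_colour(log_fc_control, log_fc_rescue, p_value, cutoff=0.05):
    """Return the display colour for one tra-dependent gene.

    log_fc_control: log fold difference, control female / tra mutant female (x axis).
    log_fc_rescue: log fold difference, tra mutant female with feminized ISCs
        / tra mutant female (y axis).
    p_value: assumed per-gene P value for the effect of traF in ISCs/EBs.
    """
    significant = p_value < cutoff
    if significant and log_fc_control < 0 and log_fc_rescue < 0:
        return "blue"   # bottom-left quadrant: repressed by traF in ISCs/EBs
    if significant and log_fc_control > 0 and log_fc_rescue > 0:
        return "red"    # top-right quadrant: activated by traF in ISCs/EBs
    return "black"      # tra-dependent, but not through traF in progenitors


# Hypothetical values for three genes.
print(quadrant_colour(-1.2, -0.8, 0.01))
print(quadrant_colour(1.5, 0.9, 0.03))
print(quadrant_colour(1.5, 0.1, 0.40))
```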




[Extended Data Figure 8 graphic not reproduced. Panel titles: a, b, c, d, Homeostasis; e, f, Reproduction.]


Extended Data Figure 8 | Effects of the sexual identity of adult ISCs on
midgut size and reproductive plasticity. a, The number of cells in the
R3a-b and R4a midgut regions, defined by expression of Cut and MvlNP2375
respectively (as described in ref. 86), is higher in females, and can be
significantly reduced in females to numbers comparable to those found in
males after 20 (but not 3) days of adult-specific downregulation of
tra in intestinal progenitors (achieved by esgts-driven tra downregulation
initiated after the phase of midgut post-eclosion growth, see Methods
for details). No effects are apparent following downregulation in males.
Representative images of these midgut regions (labelled in red with Cut or
in green with MvlNP2375 esgts-driven GFP) are shown to the right for each
genotype. b, Adult ISC/EB-specific tra downregulation does not affect
the number of ISCs (esg-positive, Su(H)-negative cells) in either males
or females, but strongly reduces EB (esg-positive, Su(H)-positive cells)
production in females. c, d, Quantifications as in a and b for midguts with
adult-specific downregulation of Sxl in intestinal progenitors. c, Reduced
number of cells in the R4a midgut region (top graph) and total midgut
length (bottom graph) in female flies following 20 days of adult-specific
and cell-autonomous masculinization of their intestinal progenitors
(achieved by downregulation of Sxl over 20 days with esg-Gal4). The same
manipulation has no discernible effects in males. d, The same genetic
manipulation does not affect the number of ISCs (esg-positive,
Su(H)-negative cells) in either males or females, but strongly reduces
EB (esg-positive, Su(H)-positive cells) production in females. e, The
number of EBs (esg- and Su(H)-positive cells, bottom graph), but not
ISCs (esg-positive only cells, top graph), is higher in control female flies
3 days after mating. Adult ISC/EB-specific tra downregulation abrogates
the postmating increase in EBs in females without affecting EB number
in males, or ISC number in males or females. f, Adult-specific Sxl
downregulation in intestinal progenitors leads to a small, but significant,
reduction in egg production. An unrelated manipulation that also reduces
ISC proliferation by inducing differentiation of ISCs (esgts>Notchintra,
images to the right of the graph and ref. 87) also results in reduced egg
production, whereas downregulation of Dsx (which does not control
sex differences in progenitor proliferation) has no such effect. It should,
however, be noted that esg-Gal4 is expressed in a subset of cells in the
ovary niche. Hence, the possibility that these cells contribute to the
observed phenotype cannot entirely be ruled out. Images to the right
show loss of intestinal progenitor cell markers esg-Gal4 and Su(H)-LacZ
following expression of Notchintra in adult intestinal progenitors, indicative
of loss of progenitor identity. n denotes the number of guts (a, c), ISCs/
EBs (b, d, e) or female flies (f) that were analysed for each genotype.
Results were combined from at least two independent experiments. See
Supplementary Information for full genotypes.
Extended Data Figure 9 | Effects of the sexual identity of adult ISCs
on the susceptibility to genetically induced tumours. a, The number of
mitoses in Apc-ras mutant clones is larger than that of control clones in
both sexes, but it is higher and dependent on Sxl in females. b, The size
of Delta (Dl, the Notch ligand) null mutant (Dl®*’"!) MARCM clones is
larger than that of control clones in both sexes, but female mutant clones
are larger than male mutant clones. The graph shows quantifications
of the number of cells within each clone 15 days after clone induction
by heat shock, and the confocal images show representative clones
(labelled in green with GFP) for each genotype. c, In tra 'female' mutant
flies (traKO and traKO/Df(3L)st-j7), reduced Notch signalling in intestinal
progenitors fails to induce the hyperplasia (quantified by the number of
pH3-positive cells) normally observed in control females. d, Following
15 days of adult-specific downregulation of Notch (N) in intestinal
progenitors, hyperplasia (quantified by the number of pH3-positive cells and
also shown in representative images) is observed in female, but not male,
midguts. Adult-specific and cell-autonomous reversal of ISC/EB female
identity, achieved by esgts-driven downregulation of Sxl, fully prevents
the hyperplasia induced by Notch downregulation in females, but has no
effects on males. Confocal images show intestinal progenitor coverage of
representative midgut portions for each genotype (DNA: DAPI, in blue;
ISC/EB marker: GFP, in green). e, pH3 quantifications show a comparable
effect for independent RNAi transgenes against Sxl and Notch.
f, Adult-specific downregulation of Notch (N) signalling by ectopic
expression of the downstream Notch signalling antagonist Hairless (H) leads
to hyperplasia (quantified by the number of pH3-positive cells and also
shown in representative images) in female, but not male, midguts.
Adult-specific and cell-autonomous reversal of ISC/EB female identity,
achieved by esgts-driven downregulation of Sxl, fully prevents the
hyperplasia induced by Hairless overexpression in females, but has no effects
on males. Confocal images show intestinal progenitor coverage of
representative midgut portions for each genotype (DNA: DAPI, in blue; ISC/EB
marker: GFP, in green). g, The number of pH3-positive cells 15 days after
Notch downregulation in adult intestinal progenitors of double null mutant
flies lacking dsx and fru (dsxΔ, Df(3R)Exel6179/dsx1, fruP1.LexA) is
comparable to that of controls in both male and female flies. As in control
flies, it is significantly higher in female flies. Virgin flies were used in
all these experiments. n denotes the number of guts (a, c, d, e, f, g) or
clones (b) that were analysed for each genotype. Results were combined from
at least two independent experiments. See Supplementary Information for full
genotypes.
Effector T-cell trafficking between the
leptomeninges and the cerebrospinal fluid


Christian Schläger, Henrike Körner, Martin Krueger, Stefano Vidoli, Michael Haberl, Dorothee Mielke, Elke Brylla,
Thomas Issekutz, Carlos Cabañas, Peter J. Nelson, Tjalf Ziemssen, Veit Rohde, Ingo Bechmann, Dmitri Lodygin,
In multiple sclerosis, brain-reactive T cells invade the central
nervous system (CNS) and induce a self-destructive inflammatory
process. T-cell infiltrates are not only found within the parenchyma
and the meninges, but also in the cerebrospinal fluid (CSF) that
bathes the entire CNS tissue. How the T cells reach the CSF,
their functionality, and whether they traffic between the CSF
and other CNS compartments remain hypothetical. Here we
show that effector T cells enter the CSF from the leptomeninges 
during Lewis rat experimental autoimmune encephalomyelitis 
(EAE), a model of multiple sclerosis. While moving through the 
three-dimensional leptomeningeal network of collagen fibres in a 
random Brownian walk, T cells were flushed from the surface by 
the flow of the CSF. The detached cells displayed significantly lower 
activation levels compared to T cells from the leptomeninges and 
CNS parenchyma. However, they did not represent a specialized 
non-pathogenic cellular sub-fraction, as their gene expression 
profile strongly resembled that of tissue-derived T cells and they 
fully retained their encephalitogenic potential. T-cell detachment 
from the leptomeninges was counteracted by integrins VLA-4 and 
LFA-1 binding to their respective ligands produced by resident 
macrophages. Chemokine signalling via CCR5/CXCR3 and 
antigenic stimulation of T cells in contact with the leptomeningeal 
macrophages enforced their adhesiveness. T cells floating in the 
CSF were able to reattach to the leptomeninges through steps 
reminiscent of vascular adhesion in CNS blood vessels, and invade 
the parenchyma. The molecular/cellular conditions for T-cell 
reattachment were the same as the requirements for detachment 
from the leptomeningeal milieu. Our data indicate that the 
leptomeninges represent a checkpoint at which activated T cells 
are licensed to enter the CNS parenchyma and non-activated T cells 
are preferentially released into the CSF, from where they can reach 
areas of antigen availability and tissue damage. 

T cells specific for myelin basic protein (MBP) retrovirally transduced
to express fluorescent proteins (TMBP cells) were tracked on their
way into the CNS tissues after intravenous (i.v.) transfer to healthy
recipient rats. The TMBP cells arrived at the CNS on the level of the
leptomeninges of the spinal cord before they entered the parenchyma
with the onset of clinical disease. Concomitantly with their arrival in the
leptomeningeal milieu, TMBP cells accumulated in the CSF (Extended
Data Fig. 1a, b).

It has been proposed that immune cells, for example, T helper
17 (TH17) cells and M2 macrophages, enter the CSF from either the
leptomeninges or the choroid plexus before spreading to the CNS
tissues. We could not detect TMBP cells in the choroid vessels or
stroma before or during their accumulation in the CSF when we used
intravital two-photon laser scanning microscopy (TPLSM) to image
the choroid plexus of the fourth ventricle. Very few cells appeared in
the choroid plexus at late phases of CNS infiltration, when TMBP cells
had already maximally accumulated in the leptomeninges and CSF
(Fig. 1a). Serial fluorescence microscopy of all four choroid plexuses,
immuno-electron microscopy and cytofluorometric quantifications
revealed similar results (Extended Data Fig. 1c-f). To exclude the
possibility of transport of TMBP cells from the choroid plexus to the
leptomeninges of the spinal cord, we injected a nontoxic polymerizing
agent (Matrigel) into the cisterna magna, that is, between the
ventricular egression points and subarachnoidal space of the spinal
cord. The cisternal block neither prevented the invasion of TMBP
cells into the subjacent spinal cord nor influenced the development
of clinical disease (Extended Data Fig. 2a, b). Similarly, interrupting
the CSF flow from the choroid plexus to the spinal cord did not
hinder the accumulation of TMBP cells in the lumbar/thoracic spinal
cord (Extended Data Fig. 2c-e).

TMBP cells crawling within the leptomeningeal vessels transgressed
the vascular walls (Extended Data Fig. 3a) and moved within the
leptomeningeal environment (Supplementary Video 1). This milieu
is highly specialized: the pial vessels that run along the surface of the
spinal cord parenchyma are suspended in a dense three-dimensional
(3D) extracellular matrix (ECM) network consisting mainly of collagen
fibrils (Extended Data Fig. 3b). Just as the bloodstream flows
over the vascular endothelia, the CSF within the subarachnoidal space
flows over the pial ECM network. Interestingly, when following the
movement of single migrating TMBP cells, we regularly observed them
at the interface of the ECM network and fluidic compartment becoming
detached and being washed into the CSF (Fig. 1b, Extended Data
Fig. 3c, Supplementary Videos 2 and 3). This detachment was verified
by imaging in situ labelled T cells expressing the photo-convertible
fluorescent protein Dendra2 (Extended Data Fig. 3d, e, Supplementary
Video 4). Photo-converted TMBP-Dendra2 cells in the leptomeninges
steadily decreased over time and were replaced by non-photo-converted
TMBP-Dendra2 cells, indicating rapid turnover of T cells in this CNS
compartment (Extended Data Fig. 3e).

RNA sequencing (RNA-seq) revealed that T_MBP cells from the blood,
CSF, leptomeninges and spinal cord parenchyma displayed broad
conformity of their gene expression profiles, with very similar
expression levels of master transcription factors, chemokine receptors,
adhesion molecules, cytokine receptors, cell motility and T-cell
receptor genes (Fig. 2a, Extended Data Fig. 4a, b, Supplementary
Table 1). The exceptions were genes of the T-cell activation program:
T_MBP cells from the leptomeninges and CNS parenchyma displayed
strongly upregulated genes of cytokine-cytokine-receptor interaction,

Figure 1 | T_MBP cells enter the CSF from the leptomeninges. a, Intravital
imaging of T_MBP cell entry into the leptomeninges and choroid plexus
(CP) of the fourth ventricle. TPLSM recordings were performed during
the pre-invasion, early and established phases of leptomeningeal T-cell
infiltration (see Methods for the definition of phases) sequentially in
thoracic spinal cord, medulla and choroid plexus of fourth ventricle in the
same animal. Representative recordings of 16 independent experiments.
Blue, T_MBP-Lifeact-Turquoise2 cells; red, vessel lumen, meningeal macrophages;
yellow and white arrows, intravascular and extravasated cells, respectively.
b, In vivo visualization of T_MBP cell detachment from the leptomeninges
into the CSF. Intravital images of leptomeninges during the established
phase of leptomeningeal T-cell infiltration showing a T_MBP-GFP cell
(false red colour) detaching from the pial surface. Yellow lines, crawling
steps of a T_MBP-GFP cell; magenta line, migratory path of a T cell rolling
on the leptomeninges and being dragged away into the CSF; grey (false
colour), vessel lumen, meningeal macrophages; red arrows, T_MBP-GFP cell
detachment; orange arrow, CSF flow direction. Scale bars, 50 µm.


JAK-STAT signalling and T-cell receptor signalling pathways, whereas
CSF-derived T_MBP cells resembled blood-derived T cells that lacked
virtually any signs of activation (Fig. 2b, Extended Data Fig. 4b-e,
Supplementary Tables 2 and 3). TPLSM recording of the nuclear
translocation of a fluorescently labelled nuclear factor of activated T cells
(NFAT-YFP) sensor!” confirmed these findings: in contrast to the
leptomeninges and the CNS parenchyma where 26% and 35%, respectively,
of the T_MBP-NFAT/YFP cells showed nuclear translocations of the
sensor, virtually no T_MBP-NFAT/YFP cells in the CSF displayed nuclear
NFAT translocations (<2%; Fig. 2c).

The low activation profile of CSF-derived T_MBP cells could not
be explained by inhibitory effects of the CSF on T-cell activation
(Extended Data Fig. 4f). Most notably, CSF-derived T_MBP cells were
fully responsive to antigenic stimulation and induced ‘classic’ EAE
(Fig. 2d), indicating that T cells in the CSF, despite their reduced
activation levels, were fully functional.

We next performed i.v. transfer of myelin oligodendrocyte glycoprotein
(MOG)-reactive T cells (T_MOG cells), which enter the CNS
tissues but reach only low reactivation levels within the CNS and
induce very mild clinical disease11 (Extended Data Fig. 5a). T_MOG cells
invaded the CNS parenchyma to a lesser extent than T_MBP cells but
remained in higher percentages in the CSF and leptomeninges (Fig. 2e
and Extended Data Fig. 5b). This shifted distribution pattern became
even more pronounced when brain non-reactive ovalbumin-specific
T cells (T_OVA cells) were transferred into rats. They entered the CNS
milieu similarly to T_MBP or T_MOG cells, though in much lower numbers
(Fig. 2e), but mostly remained in the CSF/leptomeninges compartments
(Fig. 2e and Extended Data Fig. 5b) and did not evoke any
detectable CNS inflammation (Extended Data Fig. 5c).

The situation markedly changed in inflamed CNS tissue. When
T_OVA cells were co-transferred with pathogenic T_MBP cells that induced
a strong upregulation of chemokines, integrin ligands and cytokines
within the leptomeninges and parenchyma, high numbers of T_OVA
cells, irrespective of their low activation levels, invaded the CNS and
migrated from the CSF/leptomeninges compartments deep into the
CNS parenchyma (Fig. 2e and Extended Data Fig. 5b-d). There were
still more T_OVA cells in the CSF than T_MBP cells, indicating that even
during CNS inflammation the lack of T-cell reactivation affects the
adhesiveness of T_OVA cells within the leptomeningeal milieu (Extended
Data Fig. 5e). Taken together, these data indicate that in addition to the
T-cell reactivation state, inflamed CNS tissue supports T-cell attachment
to the leptomeninges and migration into the CNS parenchyma.

We next investigated the cellular partners and molecular cues that
guide T cells within the leptomeningeal milieu. Enmeshed in the pial
ECM are numerous macrophages that scan their environment by protruding
and retracting cellular processes (Extended Data Fig. 6a). T_MBP
cells crawling through the leptomeningeal milieu were in contact with
these cells the majority of the time (>70%) (Extended Data Fig. 6a, b,
Supplementary Video 5). In contrast to T cells in lymphatic tissue!”,
the T_MBP cells in the leptomeninges moved in a Brownian random
walk (Fig. 3a, b) not following measurable microgradients (Extended
Data Fig. 6c). CD8+ T cells searching for rare infectious foci during
toxoplasma infection of the CNS displayed a Lévy walk behaviour
that is characterized by a combination of random walk and large
jumps13. Apparently when the interacting partners within the CNS
tissue are abundant, as is the case with leptomeningeal macrophages,
Brownian motion is sufficiently efficient’. Meningeal macrophages
express major histocompatibility complex (MHC) class II molecules
on their surface and therefore can act as antigen-presenting cells'®!>"'®.
In addition, they express fibronectin and ICAM-1 (Extended Data
Fig. 6d), that is, ligands of VLA-4 (α4β1) and LFA-1 (αLβ2) integrins
expressed by T_MBP cells (Extended Data Fig. 7a). In situ staining with
an antibody directed against the activated conformation of VLA-4
revealed that 10% of the T_MBP cells within the leptomeninges, but none
in the CSF, carried activated integrin on their membrane (Extended
Data Fig. 7b). Intrathecal (i.t.) injection of neutralizing anti-VLA-4
and/or anti-LFA-1 monoclonal antibodies shortened the contact time
of T_MBP cells with leptomeningeal phagocytes, whereas the contact
frequency increased (Fig. 3c). Furthermore, in the presence of the
CSF flow, the velocity of the T cells increased (Fig. 3c, Extended Data
Fig. 7c, d), but their random walk behaviour remained unaffected
(Extended Data Fig. 7e). Interestingly, we noted a substantial detachment
of T_MBP cells from the leptomeninges; these cells then rolled along the
pial surface or floated in the CSF (Fig. 3d, e, Extended Data Fig. 7f,
Supplementary Video 6). Accordingly, photo-converted T_MBP-Dendra2 cells
vanished more rapidly from the leptomeninges after VLA-4/LFA-1 interference
(Extended Data Fig. 7g).
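
The distinction between Brownian and heavy-tailed (Lévy-like) movement drawn above can be illustrated with a small numerical sketch. This is not the authors' analysis code. It assumes a two-dimensional random walk with Gaussian steps, for which the displacement norms follow a Rayleigh distribution (a Weibull distribution with shape parameter 2), and it compares the empirical distribution with the fitted Rayleigh curve using the Kolmogorov-Smirnov statistic. All function names, the step scale and the Pareto comparison sample are hypothetical choices made for illustration.

```python
import math
import random


def rayleigh_cdf(r, sigma):
    """CDF of the Rayleigh distribution (Weibull with shape 2)."""
    return 1.0 - math.exp(-(r * r) / (2.0 * sigma * sigma))


def fit_rayleigh_sigma(norms):
    """Maximum-likelihood estimate of the Rayleigh scale parameter."""
    return math.sqrt(sum(r * r for r in norms) / (2.0 * len(norms)))


def ks_statistic(norms, cdf):
    """Largest gap between the empirical CDF and a model CDF."""
    xs = sorted(norms)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def rayleigh_ks(norms):
    """Fit a Rayleigh scale to the data and return the KS statistic."""
    sigma = fit_rayleigh_sigma(norms)
    return ks_statistic(norms, lambda r: rayleigh_cdf(r, sigma))


def brownian_displacement_norms(n_steps, sigma, rng):
    """Displacement norms of a 2D walk with independent Gaussian steps."""
    return [math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n_steps)]


rng = random.Random(0)
# Hypothetical step scale of 3 µm per frame; not a value from the paper.
brownian = brownian_displacement_norms(5000, sigma=3.0, rng=rng)
# Heavy-tailed comparison sample standing in for Lévy-like jumps.
heavy_tailed = [rng.paretovariate(1.5) for _ in range(5000)]

print("KS statistic, Brownian sample:", rayleigh_ks(brownian))
print("KS statistic, heavy-tailed sample:", rayleigh_ks(heavy_tailed))
```

A small statistic means the displacement data are compatible with the Brownian model, whereas the heavy-tailed sample deviates strongly. Because the scale parameter is estimated from the same data, standard KS P values would not apply directly, so the sketch reports only the statistic.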

Chemokines are known to activate the integrin-mediated binding
of immune cells!”. T_MBP cells upregulate chemokine receptors
before entering the CNS, in particular CCR5, CXCR3 and CXCR4
(ref. 18; Extended Data Fig. 8a). The corresponding chemokines
(CCL5, CXCL9-11 and CXCL12) were present in the CNS tissues
and specifically expressed by the resident macrophages (Extended
Data Fig. 8b-d). After i.t. application of pertussis toxin (PTX; which
acts as a global chemokine inhibitor), high numbers of T_MBP cells
became detached from the pial surface and were dragged into the
CSF (Extended Data Fig. 8e, Supplementary Video 7). Their random
walk behaviour and straightness of movement were unchanged but
their contact time with the resident macrophages decreased (Extended
Data Fig. 8f, g). In contrast to integrin blockade, the T-cell velocity
remained unchanged (Extended Data Fig. 8f). Similar observations
were made after i.t. injection of Met-RANTES and anti-CXCR3 monoclonal
antibodies, which specifically interfere with CCR5 and CXCR3,
respectively. In contrast, the CXCR4 blocker AMD3100 had no effect
(Fig. 3f, Extended Data Fig. 8f, h).

T_OVA cells also migrated in a Brownian locomotion pattern and
came into contact with the local macrophages, but these contacts were
less frequent and shorter than those of T_MBP cells and their velocity
was higher (Extended Data Fig. 9a-c). Interfering with integrins or
chemokines also caused a strong detachment of T_OVA cells from the
leptomeninges (Extended Data Fig. 9d, e). Notably, neither integrin

Figure 2 | Characterization of T_MBP cells in the CSF. a, Similarities in
transcriptome profiles between T_MBP cells in blood, CSF, leptomeninges
and CNS parenchyma. RNA-seq was performed for T_MBP-GFP cells sorted
from the indicated compartments 3 days after transfer, and relative (rel.)
mRNA expression of selected adhesion molecules and chemokine receptor
genes is shown. b, Genes related to T-cell activation are upregulated
in T_MBP cells from meninges and CNS parenchyma compared to CSF.
Relative mRNA expression (RNA-seq analyses) of selected genes that are
highly regulated in T_MBP-GFP cells from the indicated CNS compartments.
Expression levels in the T cells isolated from blood were set to 1 (a, b).
House-keeping genes, Actb and Cd3e. c, In vivo analyses of T_MBP-NFAT cells
confirm that their activation levels within the leptomeninges exceed those
in the CSF. Percentages of cells with either cytoplasmic (not activated)
or nuclear (activated) NFAT sensor during the established phase
of leptomeningeal T-cell infiltration were determined by TPLSM
(218 cells from 5 independent experiments, left plot) or by histology of
SC cryosections (white matter (WM): 352 cells, meninges: 1,371 cells) and
of cytospin preparations (CSF: 326 cells), right plot. Data are mean ± s.d.
d, T_MBP cells in the CSF retain their pathogenic potential. 1.7 x 10°
in vitro reactivated T_MBP-GFP cells isolated from CSF of animals 3 days
after transfer or an equivalent number of in vitro T cell blasts were injected
i.v. into naive animals. Representative clinical score of 2 independent
experiments (n = 6). The moderate severity of clinical disease is due to
the low number of injected T cells. e, T-cell reactivation and inflammatory
state of the CNS tissue determine T-cell trafficking from leptomeninges
into the CSF or CNS parenchyma. Flow cytometry quantification of
antigen-specific T cells in the CNS compartments at the indicated time
points after transfer. The first three plots show T_MBP-GFP cells, T_MOG-GFP
cells and T_OVA-GFP cells with high, low or no reactivation potential,
respectively. Right, T_OVA-GFP cell distribution in an inflamed CNS tissue
(co-transfer with non-labelled T_MBP cells). Data are mean ± s.e.m. of at
least 6 animals per group per time point for each antigen specificity from
at least 2 independent experiments.


nor chemokine interference influenced the basal locomotion
characteristics or the Brownian motility pattern of T_OVA or T_MBP cells
(Extended Data Figs 8f, g, and 9f-h). These findings indicate that
effector T cells in the 3D network of the leptomeninges, similarly
to dendritic cells in a 3D environment!?, are integrin-independent
and follow an intrinsic locomotion program independent of chemokinetic
or chemoattractive stimuli. In contrast, the interactions with
local antigen-presenting cells are regulated by chemokine- and
integrin-mediated adhesive forces that prevent effector T cells from
being released into the CSF from the flow-exposed 2D milieu of the
leptomeninges.

T cells were previously proposed to migrate from the CNS tissues via
the CSF along lymphatic vessels of the dura mater”, but it is unclear
whether T cells in the CSF also return to the leptomeninges and the
CNS parenchyma. After i.t. injection, T_MBP cells were indeed spread
within the leptomeninges and the adjacent parenchyma (Extended
Data Fig. 10a). This infiltration pattern resembled that of early EAE
lesions but clearly differed from that of solutes?”, which were proposed
to be transported from the CSF into the CNS parenchyma via
peristaltic forces along periarterial spaces. This ‘glymphatic’ transport

Figure 3 | Integrins and chemokines mediate T_MBP cell adhesion to
leptomeningeal structures. a, b, TPLSM recordings show the motility
behaviour of T_MBP-GFP cells in the leptomeningeal milieu. Time interval,
32 s. a, T_MBP cells do not follow a preferential direction. Directions
(left) and associated probability for 0-360° angles (right). b, T_MBP cells
move in a Brownian random walk. Left, probability distribution of
T-cell displacements in a linear scale with overlapping fitting of the
displacement norm (|u_i|) for Lévy (blue), Weibull (brown) and normal
(Gaussian, orange) distributions. Respective P values are indicated.
Only the Weibull distribution (equivalent to Brownian movement)
has statistical significance. Right, probability plots of the indicated
distributions against the displacement norm data with the associated
results of Kolmogorov-Smirnov (KS) test. Analysis of at least
9 TPLSM recordings. c, Integrin blockade interferes with T cell and
antigen-presenting cell contact and accelerates T-cell migration. Intravital
30 min TPLSM recordings. Mean contact duration (529 contacts per
208 cells), contact frequency (208 cells) between T_MBP-GFP cells and
leptomeningeal phagocytes and velocity (320 cells) before (control) and
4 h after i.t. injection of LFA-1/VLA-4-blocking monoclonal antibodies.
d, e, Integrin blockade induces a release of T_MBP cells from the meninges
into the CSF and increases T-cell rolling. d, Flow cytometry quantification
of T_MBP-GFP cells in the different CNS compartments 4 h after i.t. treatment
with PBS (control) or blocking antibodies. Data are mean ± s.e.m. of
3 independent experiments (c, d) including three animals per group
(d) (two-tailed Mann-Whitney U-test). e, Number of T_MBP-GFP cells
rolling on the pial surface during 10 min TPLSM recordings before (0 h)
and at the indicated time points after integrin blockade. f, Interference
with chemokine signalling induces a release of T_MBP cells from the
leptomeninges into the CSF. Flow cytometry quantification of T_MBP-GFP
cells in the different CNS compartments 4 h after i.t. treatment with PBS
(control), anti-CXCR3 monoclonal antibody, Met-RANTES or AMD3100.
Data are mean ± s.e.m. of representative data from two (e) or three (f)
independent experiments including three animals per group (Kruskal-Wallis
ANOVA followed by Dunn’s multiple comparison test). Analyses
performed during the established phase of leptomeningeal T_MBP cell
infiltration (a-f). *P < 0.05, **P < 0.01, ***P < 0.001 (c-f).


mechanism does not seem to apply to cellular components”*. We 
detected that T-cell transport in the CSF was mainly driven by the 
animal's respiration rather than the cardiac cycle** (Extended Data 
Fig. 10b, Supplementary Video 8). After injection of T_MBP cells into the
subarachnoidal space of the cisterna magna or the lumbar spinal cord, 
the majority of cells accumulated at the levels of cell injections, that is, 
at the medulla/cervical and lumbar/thoracic spinal cord, respectively 

Figure 4 | Reattachment of effector T cells from the CSF to the
leptomeninges and entry into the CNS parenchyma. a, In vivo
visualization of T_MBP cell reattachment from the CSF to the leptomeninges.
TPLSM recordings (30 min) during the established phase of
leptomeningeal T_MBP cell infiltration. Example of a T_MBP-GFP cell
(false red colour) reattaching from the CSF to the pial surface with
subsequent rolling (magenta dotted lines), capture (orange asterisks) and
crawling steps (yellow dotted lines). Green, T_MBP-GFP cells; grey (false
colour), vessel lumen, meningeal macrophages; orange arrow, direction
of CSF flow. Time interval, 32 s. Scale bar, 50 µm. b-d, Conditions that
regulate T_MBP cell reattachment and entry into the CNS tissue. b, Blocking
of integrin or Gαi signalling strongly reduces the capacity of T_MBP cells
to migrate from the CSF into the CNS parenchyma. T_MBP-GFP cells
isolated from the CNS parenchyma were pre-treated with anti-VLA-4 or
anti-LFA-1 monoclonal antibodies or with isotype control IgG (control)
(left), or PTX or PBS (control) (right). The cells were then injected i.t.
into naive animals. Entry of the transferred T_MBP-GFP cells into the spinal
cord (including meninges and parenchyma) and the CSF was quantified
by flow cytometry 42 h later, that is, when the re-transferred T cells had
maximally infiltrated the spinal cord of the recipient animals. Mean ± s.d.
of representative results of 3 independent experiments (n = 18, Kruskal-Wallis
ANOVA followed by Dunn's multiple comparison test (left) or
two-tailed Mann-Whitney U-test, right). c, d, Activation fosters trafficking
of T_MBP cells from the CSF into the CNS tissue. c, T_MBP-GFP cells isolated
from CSF or CNS parenchyma were injected i.t. into naive animals (n = 6).
d, CSF-derived T_MBP-GFP cells with or without previous antigenic stimulation
in vitro were transferred i.t. Cell quantifications as in b. Data are mean ± s.d.
of three (c) and two (d) independent experiments for each condition (n = 8,
two-tailed Mann-Whitney U-test). **P < 0.01, ***P < 0.001 (b-d).


(Extended Data Fig. 10c). However, a considerable number of cells also
reached remote areas of the CNS, indicating that the CSF can serve
as a transport medium for the cells. These distributions clearly differ
from that of regular EAE (Extended Data Fig. 10d), in agreement with
our observation that during EAE effector T cells enter the meningeal
milieu via local vessels.

T_MBP cells initially floating within the CSF rolled along the pial
surface and then stopped; thereafter, they continued their locomotion
by crawling or rolling along the leptomeningeal structures (Fig. 4a,
Supplementary Videos 4, 9 and 10), reminiscent of vascular adhesion
steps”°. The molecular requirements for the attachment of the T cells
were also similar and included integrins and chemokines (Fig. 4b).
However, the molecular range in leptomeningeal adhesion included
LFA-1 in addition to VLA-4 (Fig. 4b), only the latter being a requirement
for T-cell adhesion to the CNS endothelium’. Furthermore,
integrin blockage completely abolished T-cell rolling along the
endothelium’, but not along the pial surface (Fig. 3e, Supplementary
Video 6), indicating that during the leptomeningeal reattachment
integrins control fixed adhesion of the T cells rather than their initial
binding steps. Another notable difference to vascular adhesion was the
role of T-cell activation. CNS endothelial cells do not act as
antigen-presenting cells for the incoming T_MBP cells!°. However, antigen-driven
T-cell activation significantly contributed to the T-cell reattachment in
the leptomeninges: non-activated CSF-derived T_MBP-GFP cells entered
the CNS tissue after i.t. injection less efficiently than their activated
counterparts from the CNS parenchyma (Fig. 4c). When the CSF-derived
T cells were activated before transfer, their invasive capacity
significantly increased (Fig. 4d). Consequently, i.t.-transferred activated
T_MBP cells invaded the CNS parenchyma and induced clinical
EAE, whereas the resting cells were non-pathogenic (Extended Data
Fig. 10e). Pretreatment of the activated T cells with anti-integrin
monoclonal antibodies significantly reduced the severity of clinical
EAE (Extended Data Fig. 10f). T_OVA cells attached less efficiently than
T_MBP cells. However, when the T_OVA cells were activated, after antigenic
stimulation before i.t. transfer or by evoking antigen-specific
reactivation within the CNS by supplying OVA i.t.”°, they invaded the
CNS more efficiently (Extended Data Fig. 10g).

The reattachment to the leptomeninges depended not only on
the activation state of the T cells but also on the inflammatory state of
the CNS tissue, as after i.t. transfer, T_MBP cells preferentially invaded
inflamed CNS tissue compared to that of control rats (Extended Data
Fig. 10h). The enhanced invasion could be substantially reversed
by pretreating the cells with integrin/chemokine blockers before i.t.
transfer (Extended Data Fig. 10i). Both observations indicate that the
molecular/cellular requirements for reattachment of T cells from the
CSF to the leptomeningeal milieu were the same as for their adhesion.

It is important to note that our findings were not an artefact of transfer
EAE. In animals with active EAE induced by reactivated memory
T_MBP cells?’, T cells exhibited a very similar trafficking behaviour and
function in the CSF (Extended Data Fig. 10j-l).

Our data support the view that the leptomeninges represent an 
important checkpoint for T-cell infiltration of the CNS during auto- 
immune inflammation or immune surveillance”’®”’. Trafficking 
between CSF and CNS tissue is well regulated by integrin adhesive 
forces that are triggered by T-cell activation and/or chemokines!””° 
(Extended Data Fig. 10m), indicating that the CSF fulfils a dual role: it 
functions as a depot that prevents potentially dangerous effector T cells 
from entering the CNS parenchyma if these cells had unsuccessfully 
scanned the meninges for antigens or damage; and the circulating CSF 
can be used by T cells for residence or as transport medium, similar 
to blood circulation, to rapidly travel to damaged areas of the CNS. 
T cells in the CSF displayed very similar expression profiles to effector 
T cells that had invaded the CNS tissue. Furthermore, they maintained 
full antigen responsiveness and pathogenic potential. Therefore, the 
characterization of readily accessible T cells in the CSF could be of 
relevance for gaining insights into the properties and function of path- 
ogenic T cells in multiple sclerosis. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Animals. Rats on a LEW/Crl background (Rattus norvegicus) were bred in the
animal facilities of the University Medical Center, Göttingen (Germany) and
held under standardized conditions. All animal experiments were approved by
the responsible authorities (number: 209.1/211-2531-36/04 and 33.9.42502-04-016/09).
Male and female animals 6-12 weeks old were used in the EAE
experiments. No differences were noted between the sexes.

Generation and culturing of T cells. CD4+ T cells reactive against myelin basic
protein, recombinant myelin oligodendrocyte glycoprotein (amino acids 1-120) or
ovalbumin were retrovirally engineered to express eGFP or mCherry (denoted as
T_MBP-GFP, T_OVA-GFP, T_MOG-GFP and T_MBP-Cherry) as previously reported*®. The
generation of MSCV-NFAT/YFP-Cherry/H2B transduced T-cell lines (T_MBP-NFAT/YFP)
is described elsewhere’. For the generation of T_MBP-Lifeact-Turquoise2 or
T_MBP-Dendra2 cell lines, the fragment coding for the fusion protein
Lifeact-mTurquoise2 (Addgene, plasmid 36201) or for the photo-switchable
protein Dendra2*! was cloned into the MCS of the murine stem cell retrovirus
pMSCVpuro (Invitrogen). The generation of primary effector T-cell lines has
been previously described°’. Extraction of guinea pig MBP and production of
recombinant MOG were performed as previously described***?. Ovalbumin
(albumin from chicken egg white grade V) was obtained from Sigma. All T-cell
lines were CD4+, CD8- and αβTCR+; they had an effector memory phenotype
(L-selectin- CD45RClow CD44high) and upon stimulation produced IFNγ and
IL-17. Phenotype, cytokine profile, antigen specificity, pathogenicity and absence
of mycoplasma contamination were verified in each cell line*®.

EAE models. Adoptive transfer EAE was induced by i.v. injection of 5 x 10° MBP-
or MOG-reactive T-cell blasts (day 2 after antigen encounter). In some experiments
2-5 x 10° ex vitro or 0.5-2 x 10° ex vivo isolated T_MBP cells were transferred into
the cisterna magna. To address the role of CNS-non-reactive T cells in a
non-inflammatory situation, 5 x 10° T_OVA-GFP/mCherry blasts were transferred as above.
To investigate the behaviour of T_OVA cells in inflamed CNS, 2.5 x 10° T_OVA-GFP cells
were co-injected with 5 x 10° non-labelled T_MBP cells. For in vivo recording during
the established phase of leptomeningeal T cell infiltration, 2 x 10° T_MBP-GFP blasts
were co-injected with 3 x 10° non-labelled T_MBP cells. This protocol allowed the
accurate tracking of individual effector T cells.

Active EAE was induced in 8- to 12-week-old memory animals by subcutaneous
antigen immunization. Memory animals were established by intraperitoneal
transfer of T_MBP-GFP cells into neonatal animals as previously described””. Weight
and clinical scores were recorded daily (score 0, no disease; 1, flaccid tail; 2, gait
disturbance; 3, complete hind limb paralysis; 4, tetraparesis; 5, death).

No statistical method was used to predetermine sample size. The experiments
were not randomized. The investigators were not blinded to allocation during
experiments and outcome assessment.

Cell isolation, flow cytometry and fluorescence-activated cell sorting. Isolation
of fluorescently labelled T cells from the tissues has been previously described**.
Briefly, mononuclear cells were isolated from EDTA-treated blood by density
gradient. Immune cells were obtained from spinal cord meninges and parenchyma
by a two-phase Percoll-density gradient. CSF was collected from the cisterna
magna using a stereotactic device. The choroid plexus of the fourth and lateral
ventricles was excised under a stereomicroscope by carefully pulling it off the wall
of the ventricles along the tenia choroidea. Nevertheless, it cannot be excluded
that adjacent meningeal tissue was excised together with the choroid plexus, thus
leading to an overestimation of T_MBP cell numbers. Meningeal macrophages were
labelled by injection of (3 kDa) Texas-Red-conjugated dextran in the cisterna
magna (40 µg per rat) and isolated by density gradient 24 h later. Labelled cell
populations were sorted by a FACSAria 4L SORP cell sorter (Becton Dickinson).
Flow cytometry analysis was performed with a FACSCalibur operated by Cell
Quest software (Becton Dickinson). Surface staining was performed by using the
following mouse anti-rat monoclonal antibodies: OX-40 antigen (CD134), OX-39
antigen (CD25, IL-2 receptor α chain), CD11b/c (OX-42), αβTCR-AF647 (clone
R73, BioLegend), TCR Vβ 8.2 (clone R78, Santa Cruz Biotechnology), TCR Vβ 8.5
(clone B73), TCR Vβ 10 (clone G101), TCR Vβ 16.1 (clone His42), CD4-PECy7
(clone OX-35, BD Biosciences), CD62L-PE (clone OX-85, BioLegend),
CD45RC-PerCPCy5.5 (clone OX-22, Abcam). All antibodies were purchased from Serotec
unless indicated otherwise. Mouse IgG1 (MOPC 31C, Sigma-Aldrich) served as
isotype control; APC-labelled anti-mouse antibody (Dianova) was used as
secondary antibody. For detection of the activated conformation of integrin β1, HUTS-4
antibodies*® (0.5 mg kg-1) were injected i.t. After 4 h, ex vivo isolated T_MBP-GFP
cells were stained with the secondary antibody. For intracellular staining, mouse
anti-rat IFNγ antibodies (DB-1, Invitrogen) and rat anti-mouse IL-17-PE (BD)
were used. Mouse IgG1κ (MOPC 31C) and rat IgG1-PE were used as control.
Cytofluorometric quantification of T cells was performed by relating the number
of cells to a known absolute amount of fluorescent beads.
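
The bead-based quantification described above amounts to a simple proportion: if a known number of beads is added to a sample and only a fraction of them is acquired, the same fraction of the cells is assumed to have been acquired. The following sketch shows that arithmetic. The function name and the example numbers are hypothetical and do not come from the study.

```python
def absolute_cell_count(cell_events, bead_events, beads_added):
    """Estimate the absolute number of cells in a sample.

    cell_events: gated cell events recorded by the cytometer
    bead_events: gated bead events recorded in the same acquisition
    beads_added: known absolute number of beads spiked into the sample
    """
    if bead_events <= 0:
        raise ValueError("at least one bead event is required")
    # Scale the cell events by the fraction of beads that was acquired.
    return cell_events * beads_added / bead_events


# Hypothetical example: 1,200 cell events, 4,000 of 10,000 beads acquired.
print(absolute_cell_count(1200, 4000, 10000))
```

The estimate relies on beads and cells being sampled at the same rate, which is why both are counted within one acquisition.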


Intravital TPLSM. Intravital studies were performed during the following phases
of leptomeningeal T_MBP cell infiltration: (1) pre-invasion, when T_MBP cells first
appeared in leptomeningeal blood vessels but had not yet extravasated; (2) early,
at the beginning of the leptomeningeal T-cell infiltration, that is, when T_MBP cells
started to distribute within the leptomeninges but had not yet invaded the CNS
parenchyma (the animals were still completely healthy); (3) established, when T_MBP
cells had entered the leptomeninges and started to invade the CNS parenchyma
(the animals began to lose body weight); (4) disease, when T_MBP cells invaded the
CNS parenchyma (the animals showed symptoms of paralytic disease).

Surgical procedures. Animals were anaesthetized, tracheally intubated, ventilated 
and stabilized in a custom-made microscope stage; body temperature was regu- 
lated by a heated pad (37.5 °C). During imaging, vital parameters were registered 
as previously described’. Thoracic leptomeninges were accessed as previously 
described’ by performing a laminectomy at level Th12/L1. Brainstem area was 
exposed as described**. For recording the choroid plexus, a cerebellar interhemi- 
spheric approach was performed to access the fourth ventricle. The rat was placed 
in a prone position with the head flexed and fixed in a custom-made stereotactic 
frame. The skin was incised in the midline (length approximately 2 cm) in order 
to expose the posterior cranial vault and first cervical vertebra. The muscle was 
detached using regular small scissors and bipolar forceps. The atlanto-occipital 
membrane was exposed between the foramen magnum and the first cervical verte- 
bra. Afterwards, the skull was thinned out with a twist drill up to the inner cortical 
layer. This layer was then removed with a slightly curved forceps. Subsequently, the 
brainstem and the vermis were exposed and could be visualized through the dural 
layer. The dura was opened in the midline in a blunt fashion using bipolar forceps 
or a blunt micro-hooklet. The adjacent brain tissue was retracted to the side and 
partially reduced in size by bipolar coagulation and resection in a 30-45° angle 
upwards in the sagittal plane in order to enter the fourth ventricle. Upon ventricle 
entry, CSF and the basal part of the fourth ventricle plexus could be visualized. 
After each preparation fluorescently labelled dextran was injected i.v. to confirm 
that the choroid plexus vessels were patent. 

Labelling of phagocytic cells and blood vessels. Meningeal phagocytes were labelled
by i.t. injection of 3 kDa Texas-Red-conjugated dextran 48 h after transfer. Blood
vessel lumen was visualized by i.v. infusion of (2000 kDa) TRITC-labelled dextran
(200 µg) before or during TPLSM recordings.

Technical equipment and processing of raw data. Time-lapse TPLSM was performed
using a LSM710/Axio Examiner Z1 microscope (Carl-Zeiss Microimaging) combined
with a >2.5 Watt Ti:Sapphire Chameleon Vision II Laser device (Coherent
GmbH). The excitation wavelength was tuned to 880 nm or 1010 nm and routed
through a 20× water NA1.0 immersion objective W Plan Apochromat (Carl-Zeiss
Microimaging). Typically, areas of 424.27 × 424.27 µm (512 × 512 pixels) width
were scanned and 50-100 µm z-stacks were acquired. The interval time between
sequential acquisitions was kept to 32 s. Emitted fluorescence was detected using
non-descanned detectors (Carl-Zeiss Microimaging) equipped with 442/46 nm,
483/32 nm, 525/50 nm, 550/49 nm, 607/70 nm and 624/40 nm band-pass filters
(Semrock Inc.). Collagen was detected by two-photon generated second-harmonic
signals.

Irreversible photo-conversion of TMBP-Dendra2 cells from green to red was 
achieved by irradiation of meningeal spots with UV light for 20 s. Recordings 
were performed at 880 nm using 483/32 nm and 550/49 nm band-pass filters. 
TPLSM recordings were acquired and processed by Zen 2009 Software (Carl-Zeiss 
Microimaging). The depth colour-coding of individual T cells in the z-plane was 
generated in Fiji software. 

Analysis of T-cell motility. Imaris software 7.1.1 (Bitplane) was used for 3D recon- 
structions and 4D analysis of acquired raw data. Cells were tracked using the auto- 
mated Imaris Track module with subsequent manual revision. Motility parameters 
including T-cell velocity (average of the instantaneous speed for each cell track), 
crawling duration and meandering index (ratio between total T-cell path length 
and the sum of the entire single displacements) were calculated as described”*” 
within a 30 min recording interval. Rolling T cells were defined as cells appearing 
as single or several round-shaped dots or cells moving solely in direction of the 
blood flow with >50 µm min⁻¹ instantaneous velocity. Owing to limitations in 
temporal resolution of the two-photon scanning/fluorescence video microscopy 
equipment used, fast rolling events might have escaped our analysis’. The motility 
parameters were analysed before and after each individual treatment. The value 
registered before the treatments did not show any difference between the groups 
and therefore they were pooled together and indicated in the legends as control. 
Statistical evaluations were performed with GraphPad 5.0.4 (GraphPad Software). 
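
As an illustration only, the motility parameters described above can be sketched in Python. The function names and the example track are hypothetical, and the straightness measure uses one common convention (net displacement divided by path length), which is an assumption and may differ from the exact ratio computed in Imaris. The 32 s frame interval and the 50 µm min⁻¹ rolling threshold are taken from the text.

```python
import math

FRAME_INTERVAL_S = 32.0  # acquisition interval stated in the text


def step_lengths(track):
    """Euclidean length of each step in a track of (x, y, z) positions in µm."""
    return [math.dist(a, b) for a, b in zip(track, track[1:])]


def mean_instantaneous_speed(track, dt_s=FRAME_INTERVAL_S):
    """Average of the per-step speeds of one track, in µm per minute."""
    steps = step_lengths(track)
    return sum(s / dt_s * 60.0 for s in steps) / len(steps)


def straightness(track):
    """Net displacement divided by path length (assumed convention, 0 to 1)."""
    path = sum(step_lengths(track))
    return math.dist(track[0], track[-1]) / path if path > 0 else 0.0


def is_rolling_step(step_um, dt_s=FRAME_INTERVAL_S, threshold_um_min=50.0):
    """True if one step corresponds to an instantaneous velocity above 50 µm/min."""
    return step_um / dt_s * 60.0 > threshold_um_min


# Hypothetical track with three steps of 4, 3 and 4 µm.
track = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (4.0, 3.0, 0.0), (8.0, 3.0, 0.0)]
speed = mean_instantaneous_speed(track)
index = straightness(track)
```

In this sketch the direction of blood flow is not modelled, so the rolling criterion covers only the velocity threshold.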

Analysis of T-cell interactions with meningeal phagocytes/analysis of T-cell activation 
in vivo. Meningeal phagocytes were labelled as above. We exclusively analysed 
motile GFP+ T cells and spots with similar density of fluorescently labelled phagocytes. 
Contact durations were determined by manually counting the number of 
frames during which individual motile GFP+ T cells were in close vicinity (<1 cell 
diameter distance) to resident phagocytes. As not all T cells were visible during the 
entire observation period of 30 min, contact frequencies were calculated as follows: 
the total number of phagocytes contacted by an individual T cell was divided by 
the total number of T-cell displacements. The obtained value was extrapolated to 
30 min. For evaluation of T-cell activation in vivo, T cells with nuclear (translo- 
cated) or cytosolic (not translocated) NFAT sensor were quantified as previously 
described" by analysing fluorescent overlap between the green and red channel. 
Merged (yellow), translocated; not merged, not translocated. 
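
The contact-frequency calculation can be sketched as follows. This is a minimal illustration under the assumption that "extrapolated to 30 min" means scaling the per-displacement rate by the number of 32 s displacements that fit into 30 min; the function name and the example numbers are hypothetical.

```python
def contact_frequency_per_30min(n_phagocytes_contacted, n_displacements,
                                frame_interval_s=32.0, window_min=30.0):
    """Contacts per displacement, scaled to a 30 min observation window."""
    if n_displacements <= 0:
        raise ValueError("a track needs at least one displacement")
    # Number of displacements a cell visible for the whole window would make.
    displacements_per_window = window_min * 60.0 / frame_interval_s
    return n_phagocytes_contacted / n_displacements * displacements_per_window


# Hypothetical cell: visible for 20 displacements, contacting 3 phagocytes.
frequency = contact_frequency_per_30min(3, 20)
```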

Mathematical analysis of chemotactic gradients. Target migration was analysed 
as follows. At first, the positions of individual motile T cells and meningeal macrophages 
from individual TPLSM recordings were extracted. For every TMBP-GFP 
cell entering a radius of 25, 50 or 100 µm from a macrophage all the motile steps 
were analysed. For each step we computed the direction relative to the examined 
phagocyte. In particular we computed the cosine between the normalized vectors 
AB and AC as shown in Extended Data Fig. 6c (the letter A corresponding to the 
initial position of the TMBP-GFP cell, B to its final position and C to the position of 
the phagocyte). Hence, the values 1 or −1 correspond to a TMBP-GFP cell moving 
towards the meningeal phagocyte or in the opposite direction, respectively. 
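
The direction measure for a single step can be sketched as below. The function names are hypothetical; A, B and C follow the definitions given above, and the radius check is a simplified assumption about how steps were selected.

```python
import math


def direction_cosine(a, b, c):
    """Cosine of the angle between step AB and the line AC towards the phagocyte."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ac = [ci - ai for ai, ci in zip(a, c)]
    norm_ab = math.hypot(*ab)
    norm_ac = math.hypot(*ac)
    if norm_ab == 0 or norm_ac == 0:
        return None  # direction undefined for a zero-length vector
    return sum(p * q for p, q in zip(ab, ac)) / (norm_ab * norm_ac)


def within_radius(a, c, radius_um):
    """True if the T cell at A lies within radius_um of the phagocyte at C."""
    return math.dist(a, c) <= radius_um


towards = direction_cosine((0, 0, 0), (1, 0, 0), (5, 0, 0))
away = direction_cosine((0, 0, 0), (1, 0, 0), (-5, 0, 0))
sideways = direction_cosine((0, 0, 0), (1, 0, 0), (0, 2, 0))
```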

Analysis of T-cell migration. Analysis of the directionality of T-cell displacement. 
This analysis was performed by using standard scripts on three different time 
scales, the fastest sampling time (Δt = 32 s), and two slower times, 4Δt and 8Δt, 
respectively. The three analysed time scales gave the same results; therefore, exclusively 
data for the fastest time scale, that is, the larger data sets, are reported. 

For analysing T-cell movement in the leptomeninges, 3D displacement compo- 
nents were recorded. The directional analysis was split into two parts. Specifically, 
we first found the plane fitting the positions of all of the effector cells at all times; 
we then computed the normal and in-plane components of each displacement 
vector $\mathbf{u}_i$, 

$$w_i = \mathbf{u}_i \cdot \mathbf{n}, \qquad \mathbf{v}_i = \mathbf{u}_i - (\mathbf{u}_i \cdot \mathbf{n})\,\mathbf{n}.$$

The directional migration analysis was performed separately on these two components. 
Here $\mathbf{n}$ is the unit normal to the best fitting plane. 
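
The decomposition of a displacement into its normal and in-plane parts can be sketched as follows. The plane fit itself is not shown; the normal vector is assumed to be known already, and all names and values are hypothetical.

```python
import math


def unit(vector):
    """Return the vector scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return tuple(x / norm for x in vector)


def decompose(u, n):
    """Split displacement u into the normal component w and the in-plane vector v."""
    w = sum(ui * ni for ui, ni in zip(u, n))        # w = u . n
    v = tuple(ui - w * ni for ui, ni in zip(u, n))  # v = u - (u . n) n
    return w, v


n = unit((0.0, 0.0, 2.0))          # assumed plane normal
w, v = decompose((3.0, 4.0, 5.0), n)
```

By construction, the in-plane vector is orthogonal to the normal, which gives a quick sanity check on any implementation.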

Particularly for the sets of displacement normal components $w_i$, we estimated 
the Wiener processes maximizing the log-likelihood function. The cell displacements 
normal to the best-fitting plane were well approximated by Wiener processes 
with very small positive drifts; the typical drift was 5-8% of the average normal 
displacement norm, that is, $\bar{w} = \langle |w_i| \rangle$. This means that the cells moved in a random 
way without following any preferential direction in the $\mathbf{n}$ direction. 
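
For a Wiener process with drift, sampled at a fixed interval, the maximum-likelihood estimates have a closed form: the drift is the mean increment divided by the interval, and the variance rate is the population variance of the increments divided by the interval. The sketch below assumes independent Gaussian increments; the function names and the example values are hypothetical and are not data from this study.

```python
import statistics


def wiener_mle(increments, dt):
    """Maximum-likelihood drift and variance rate for increments ~ N(mu*dt, s2*dt)."""
    mean_increment = statistics.fmean(increments)
    drift = mean_increment / dt
    variance_rate = statistics.pvariance(increments, mu=mean_increment) / dt
    return drift, variance_rate


def relative_drift(increments):
    """Mean increment as a fraction of the mean absolute increment."""
    mean_abs = statistics.fmean(abs(w) for w in increments)
    return abs(statistics.fmean(increments)) / mean_abs


# Hypothetical normal components (µm) recorded every 32 s.
w_values = [1.0, -1.0, 2.0, -2.0, 0.5]
drift, variance_rate = wiener_mle(w_values, dt=32.0)
fraction = relative_drift(w_values)
```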

Analysis of effector T-cell movement type. The distribution of the displacement 
norm $|\mathbf{u}_i|$ was analysed. The histograms of the probability density functions for the 
recorded data together with the estimated Weibull, Lévy and normal (Gaussian) 
distributions are shown on linear scales; for all the three cases this estimation was 
obtained by a global maximization of the log-likelihood function. For each dis- 
tribution the associated P values are reported. To compare the three hypotheses, 
probability plots were generated in which all of the estimated distributions were 
plotted against the data; the ideal line would be a straight line from (0; 0) to (1; 1). 
For all the cases investigated, the Weibull distribution was the most significant. This 
result was also confirmed by the Kolmogorov-Smirnov test. Hence, for our data, the 
tail of the displacement-normalized distribution decayed exponentially (Weibull 
distribution) and it was not well approximated by a polynomial decay typical of 
the Lévy distribution. Considering that (1) there was no evidence of preferential 
directions, and (2) the displacement-norm tail decayed in an exponential way, we 
concluded that the effector T cells moved in a Brownian random walk. 
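
The comparison of candidate distributions can be illustrated with a small, self-contained sketch. It fits a Weibull distribution by maximum likelihood (bisection on the shape parameter), fits a normal distribution by its sample moments, and compares the Kolmogorov-Smirnov distances. The data are simulated, the Lévy fit is omitted for brevity, and the function names are hypothetical; this is not the authors' script.

```python
import math
import random


def fit_weibull(data, lo=1e-3, hi=100.0, iters=60):
    """Maximum-likelihood shape k and scale lam for strictly positive data."""
    xmax = max(data)
    y = [x / xmax for x in data]  # rescaling avoids overflow in y ** k
    mean_log = sum(math.log(v) for v in y) / len(y)

    def score(k):
        # Derivative of the profile log-likelihood; increasing in k.
        powers = [v ** k for v in y]
        weighted = sum(p * math.log(v) for p, v in zip(powers, y))
        return weighted / sum(powers) - 1.0 / k - mean_log

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = 0.5 * (lo + hi)
    lam = xmax * (sum(v ** k for v in y) / len(y)) ** (1.0 / k)
    return k, lam


def ks_statistic(data, cdf):
    """Largest distance between the empirical CDF and a model CDF."""
    xs = sorted(data)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(xs))


def weibull_cdf(k, lam):
    return lambda x: 1.0 - math.exp(-((x / lam) ** k))


def normal_cdf(mu, sigma):
    return lambda x: 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Simulated displacement norms (scale 5, shape 1.5); not real measurements.
rng = random.Random(0)
sample = [x for x in (rng.weibullvariate(5.0, 1.5) for _ in range(2000)) if x > 0]
k, lam = fit_weibull(sample)
mu = sum(sample) / len(sample)
sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / len(sample))
d_weibull = ks_statistic(sample, weibull_cdf(k, lam))
d_normal = ks_statistic(sample, normal_cdf(mu, sigma))
```

Only the KS distance is reported here; P values for distributions whose parameters were fitted to the same data require a correction that this sketch does not attempt.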

Interference with integrin signalling. Intravital studies were performed during 
the early and established phases of leptomeningeal TMBP cell infiltration. To block 
integrin-mediated binding, a neutralizing mouse anti-rat monoclonal antibody 
against VLA-4 (anti-CD49d, clone TA-2)** and/or against CD11a (integrin αL, 
anti-LFA-1, clone WT.1, Serotec) was applied in the cisterna magna before or during 
intravital imaging at a single dose of 1 mg kg⁻¹. After recording, ex vivo isolated 
T cells were tested by flow cytometry for antibody saturation. 

The effect of integrins on T-cell motility in the leptomeninges in the absence 
of flow was investigated by removing the dura mater and the arachnoidea during 
intravital imaging. The subarachnoidal structures were left intact. Then, 30 min 
TPLSM recordings were performed before and after in situ application of VLA-4 
and LFA-1 monoclonal antibodies (20 µg each per rat). 

Interference with chemokine signalling. Intravital studies were performed during 
the early and established phases of leptomeningeal TMBP cell infiltration. Pertussis 
toxin A oligomer (10 µg kg⁻¹) (PTX, an exotoxin that blocks G-protein-coupled 
Gαi signalling and therefore acts as a global chemokine inhibitor; List Biological 
Laboratories), 1 mg kg⁻¹ Met-RANTES, 1 mg kg⁻¹ hamster anti-rat CXCR3 
monoclonal antibody (clone XR3.2), 0.5 mg kg⁻¹ AMD3100 (also known as 
plerixafor) (Sigma/Genzyme) or PBS were applied in the cisterna magna before 
or during intravital imaging. These monoclonal antibodies or blocking agents 
were shown to be effective in vivo in EAE models*?~“". The dose selected for the 
in vivo experiments was able to block T-cell migration in chemotaxis assays. In 
some experiments, pertussis toxin B oligomer, Armenian hamster IgG or mouse 
IgG1 isotype antibody (Abcam) were used as controls. To confirm the efficacy 
of the treatments after the imaging session, ex vivo isolated effector T cells were 
tested in chemotaxis assays towards the respective chemokines. 

Intravital fluorescence microscopy. For fast-acquisition fluorescence video 
microscopy, a LSM710/Axio Examiner Z1 microscope (Carl-Zeiss Microimaging) 
was used in combination with an HXP120C illuminator routed through a 
20x water NA1.0 immersion objective W Plan Apochromat (Carl-Zeiss 
Microimaging). Acquisition rate was 7-10 frames per minute. Fluorescent signals 
were detected using a Zeiss AxioCam HSM video camera. Mechanical hyper- 
ventilation was induced by increasing the ventilation rate in 200 g rats from 81 
to 100 breaths per minute (bpm) equivalent to the ventilation rate of 100 g ani- 
mals. Cardiac vagolytic effect was achieved by subcutaneous administration of 
the muscarinic receptor antagonist methylscopolamine (0.05 mg kg⁻¹) 30 min 
before imaging”. 

Block of T-cell transport from the fourth ventricle to the subarachnoidal space. 
Two different procedures were performed. CSF flow obstruction was obtained by 
injecting growth-factor-reduced Matrigel (BD Bioscience; 160 µl for 140-170 g 
animals) intracisternally as previously described*. To assess the success of the 
Matrigel obstruction, we mixed Evans blue dye into the Matrigel and visually 
confirmed that the Matrigel not only enclosed the medulla oblongata but also 
completely filled the entire fourth ventricle (Extended Data Fig. 2a, left panel). 
Visual inspection of Matrigel localization in and around the fourth ventricle was 
performed in all animals of each experiment. 

A spinal cord window was opened at the cervical level before intravital imaging. 
Then, the dura was removed and a small hole was made in the arachnoidea, allowing 
the CSF to leak out. The CSF efflux induced a collapse of the subjacent thoracic 
subarachnoidal space and thereby abolished the CSF flow in this area (Extended 
Data Fig. 2c, d). 

Intrathecal injection of T cells. Effector T cells were either cultured in vitro and 
injected on day 2 (activated cells) or on day 6 (resting cells) after antigen stimulation, 
or isolated ex vivo from spinal cord parenchyma or CSF. Effector T cells 
from parenchyma were purified by a two-layer OptiPrep (Axis-Shield PoC) density 
gradient. To block chemokine or integrin signalling, T cells were incubated with 
either (1) PTX (0.2 µg ml⁻¹); (2) anti-rat monoclonal antibodies against VLA-4 
(0.4 mg ml⁻¹) or LFA-1 (0.4 mg ml⁻¹); (3) mouse IgG2a (0.4 mg ml⁻¹) isotype 
control, or (4) cell culture medium alone for 1 h at 37 °C and washed thoroughly 
before injection. TMBP-GFP or TOVA-cherry/TOVA-GFP cells (0.8-2 x 10° ex vivo cells 
or 5 x 10° in vitro cultured cells) were injected into the cisterna magna in a total 
volume of 40-50 µl. In some experiments T cells were injected by lumbar puncture 
at the level L3-L4 as described™. Cells pre-incubated with antibody were tested by 
flow cytometry for antibody saturation before injection and after retrieval. 

RNA extraction, cDNA library preparation and RNA-seq. RNA extraction, 
cDNA library preparation and RNA sequencing were undertaken as previously 
described*’. Total RNA was purified using the TRIzol protocol (Invitrogen) from 
6 different samples: in vitro TMBP-GFP cell blasts (20 h after antigen encounter), 
in vitro TMBP-GFP resting T cells (day 6 after antigen encounter); ex vivo TMBP-GFP 
cells from blood, CSF, spinal cord leptomeninges and parenchyma collected 
3 days after transfer. Between 200,000 and 400,000 TMBP-GFP cells were sorted from 
each sample with a BD FACSAria 4L SORP with more than 98% purity. Three 
different biological replicates were prepared for each sample. Five to seven ani- 
mals were pooled for each ex-vivo sample. Library preparation for RNA-seq was 
performed using the TruSeq RNA Sample Preparation Kit (Illumina) starting 
from 500 ng of total RNA. Single read (45 bp) sequencing was conducted using a 
HiSeq 2000 (Illumina). Fluorescence images were transformed to BCL files with 
the Illumina BaseCaller software. Samples were demultiplexed to FASTQ files with 
CASAVA. Sequencing quality was checked and approved via the FastQC software. 
Sequences were aligned to the genome reference sequence of Rattus norvegicus 
using the STAR alignment software“ allowing for 2 mismatches within 45 bases. 
Subsequently, conversion of resulting SAM files to sorted BAM files, filtering of 
unique hits and counting was conducted with SAMtools*” and HTSeq**. Data 
were preprocessed and analysed in the R/Bioconductor environment using the 
DESeq2 package”. Candidate genes were filtered to a minimum of twofold change 
and FDR-corrected P value < 0.05. Gene annotation was performed using Rattus 
norvegicus entries from Ensembl v78 via the biomaRt package*’. KEGG pathway 
analysis was performed by using The Database for Annotation, Visualization and 
Integrated Discovery (DAVID)*!. 
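
The candidate-gene filter can be sketched as follows. The analysis itself was run in R with DESeq2; this Python fragment only illustrates the two thresholds. The field names and the example rows are hypothetical, and applying the twofold cut-off in both directions (absolute log2 fold change of at least 1) is an assumption.

```python
import math


def filter_candidates(results, min_fold_change=2.0, max_fdr=0.05):
    """Keep genes with at least a twofold change and FDR-corrected P < 0.05."""
    min_abs_log2fc = math.log2(min_fold_change)
    kept = []
    for row in results:
        padj = row["padj"]
        if padj is None:  # no adjusted P value was computed for this gene
            continue
        if padj < max_fdr and abs(row["log2_fold_change"]) >= min_abs_log2fc:
            kept.append(row["gene"])
    return kept


# Hypothetical result rows, not values from this study.
example = [
    {"gene": "gene_a", "log2_fold_change": 1.8, "padj": 0.001},
    {"gene": "gene_b", "log2_fold_change": -1.2, "padj": 0.03},
    {"gene": "gene_c", "log2_fold_change": 0.4, "padj": 0.0001},
    {"gene": "gene_d", "log2_fold_change": 2.5, "padj": 0.2},
    {"gene": "gene_e", "log2_fold_change": 3.0, "padj": None},
]
candidates = filter_candidates(example)
```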

Quantitative PCR. mRNA extraction, cDNA synthesis and quantitative PCR were 
performed as described!*. The following rat-specific primers and FAM-TAMRA-labelled 
probes not previously described”'**? were used. Vascular cell adhesion 
molecule 1 (VCAM-1): forward, 5′-TGAGGCTGGAATTAGCAAAAAAT-3′, 
reverse, 5′-TTAGATGGGAAGACTGTAAGTTGTATG-3′, probe, 5′-CTGATTATCCAAGGCTCTT-3′; 
intercellular adhesion molecule 1 (ICAM-1): forward, 5′-GAAGACAGCAGACCACTGTGCTT-3′, 
reverse, 5′-CTCGCTCTGGGGAACGAATACA-3′, probe, 5′-ACTGTGGCACCACGC-3′; 
fibronectin transcript-1 (fn-1): forward, 5′-TGATCTTTGAGGAACATGGCTTT-3′, 
reverse, 5′-GCAGGTATGGTCTTGGCCTAAG-3′, probe, 5′-AACCACGCCACCCACTGCGG-3′; 
fibronectin transcript-2 (fn-1): forward, 5′-GAGCTTCCCCAACTGGTTAACC-3′, 
reverse, 5′-GAACTGTGGAGGGAACATCCAA-3′, probe, 5′-CCACACCCCAATCTTCATGGACCAGA-3′. 
Unless otherwise stated, β-actin was used as house-keeping gene. 

In vitro proliferation assay of T cells. Amplification of ex vivo or in vitro isolated 
TMBP-GFP cells upon antigen stimulation was tested by flow cytometry as previously 
described’®. 

Histology and immunohistochemistry. Histological analysis was performed 
as described™ using the following antibodies: rabbit anti-CXCL-11 (Biorbyt), 
mouse anti-rat RECA-1 (Abcam), guinea-pig anti-Iba-1 (Synaptic Systems), 
mouse anti-rat OX-6 (rat RT1B MHC class II antigen, Serotec), mouse anti- 
alpha-smooth muscle actin (DAKO). eGFP was enhanced by GFP-Atto 488 
(Chromotek). Alexa Fluor568 goat anti-rabbit IgG, Alexa Fluor647 goat anti- 
mouse IgG (both from Invitrogen), Alexa Fluor647 donkey anti-guinea pig 
(Dianova) were used as control. Images were acquired using either a Zeiss Axio 
Observer fluorescence microscope equipped with a 10× air objective or a Zeiss 
LSM700 confocal microscope equipped with a 40× or 63× Zeiss objective. For 
histological quantification of TMBP-GFP cells in CNS areas, consecutive cryosections 
(16 µm) were acquired with a VS120 Virtual Slide Microscope (Olympus) 
equipped with a 10× objective. Counting of TMBP-GFP cells and area measurements 
were performed with Fiji software*’. For CNS explant imaging, lumbar 
spinal cord and fourth ventricle choroid plexus were prepared. Images were 
acquired by TPLSM. For detection of integrin ligands in vivo on leptomeningeal 
structures dura mater and arachnoidea were removed leaving the subarachnoi- 
dal structures intact. PE-labelled VCAM or ICAM antibodies (BioLegend) and 
the respective isotype controls were then incubated in situ for 30 min before 
confocal acquisition. 

Electron Microscopy. Animals were perfused with saline followed by a fixative 
containing 4% PFA and 1% glutaraldehyde for ultrastructural analysis and 4% PFA 
and 0.1% glutaraldehyde for immuno-electron microscopy. After post-fixation the 
spinal cord was cut into 60 µm sections using a Leica vibrating microtome. For 
immuno-electron microscopy, goat anti-GFP antibody (Acris antibodies) was used 
to detect the GFP signal. Biotinylated rabbit anti-goat antibody (Sigma Aldrich) 
was used as secondary antibody. The sections were then incubated with ExtrAvidin 
(Sigma-Aldrich) and stained with diaminobenzidine (DAB, Sigma-Aldrich) to 
achieve an electron dense precipitate. Omitting primary antibodies resulted in the 
absence of staining. After that, the sections were stained in 0.5% osmium tetrox- 
ide, dehydrated and embedded in durcupan (Sigma-Aldrich). Regions of interest 
were identified by light microscopy, cut and transferred onto blocks of resin to 
obtain ultra-thin sections using an Ultramicrotome (Leica). Ultra-thin sections 
were transferred on formvar-coated copper grids and stained using lead citrate. 
Ultrastructural analysis was performed using a Zeiss SIGMA electron microscope 
equipped with a STEM detector and ATLAS software. 

Extended Data Figure 1 | TMBP cells enter the CSF from the 
leptomeninges. a, TMBP cell entry into the CNS compartments. 
TMBP-GFP cells were quantified by histological analysis in the choroid 
plexus of all four ventricles (1st-4th vent) and spinal cord (meninges, 
white matter (WM) and grey matter (GM)) and by flow cytometry in 
the CSF. Histology of 6-27 consecutive slices per compartment. n.d., 
not detectable. Representative experiment ± s.e.m. from 2 independent 
experiments (n = 12) from days 1.5-5 after transfer. b, Accumulation 
of TMBP-GFP cells in the CSF and leptomeninges occurs simultaneously. 
Numbers of TMBP-GFP cells (flow cytometry) in meninges and CSF. Results 
of 3 independent experiments. Data are mean ± s.d. of 2-4 animals per 
time point (n = 61). c, TMBP cells accumulate in the leptomeninges but 
not in the choroid plexus. Confocal laser scanning microscopy of fixed 
tissue sections of the lateral, third and fourth ventricles and fluorescence 
microscopy images of lumbar spinal cord sections. Images were recorded 
from the same animal for each time point. Br. st., brain stem; green, TMBP-GFP 
cells; blue, DAPI counterstain of nuclei. Right (I-XII), magnifications of 
areas of interest. Representative images of 14 different time points from 
days 1.5-5 after T-cell transfer. d, Localization of TMBP cells in choroid 
plexus explants compared to spinal cord leptomeninges. Original TPLSM 
images and 3D reconstructions of explanted choroid plexus of the 
fourth ventricle. The corresponding lumbar spinal cord leptomeninges 
(lumbar SC) was acquired before explantation of the choroid plexus. 
Green, TMBP-GFP cells; red, meningeal blood vessels; blue, collagen. Few 
TMBP-GFP cells can be detected within the choroid plexus tissues compared 
to the leptomeninges of the spinal cord. Red arrows, TMBP-GFP cells 
within the choroid plexus stroma; white arrows, TMBP-GFP cells outside 
of the choroid plexus tissue. Representative pictures of 2 independent 
experiments (n = 10). Scale bars, 100 µm (c, d). e, TMBP cells accumulate 
in the CSF before appearing in the choroid plexus. Quantification of 
TMBP-GFP cells from the choroid plexus of the fourth and lateral ventricles 
and from the CSF (flow cytometry). Means ± s.d. Representative data of 
3 independent experiments including 2-4 animals per time point. 
f, Anti-GFP immuno-electron microscopy confirms the scarcity of 
TMBP cells in the choroid plexus. Upper left, choroid plexus of the fourth 
ventricle. The choroid plexus epithelial cells are colour-coded in yellow. 
The lamina propria between epithelial basal lamina (blue) and vascular 
basal lamina (red) is filled with loose connective tissue and meningeal 
fibroblasts. Arrow heads, endothelial nuclei of cross sectioned vessels (V). 
Top right, ependyma. The surface of the ventricular ependyma (arrows) 
appears smooth and continuous with numerous ciliary processes. No TMBP 
cells could be detected for any of the observed time points after T-cell 
transfer. Lower panel, leptomeninges. TMBP-GFP cells (T) are marked by 
typical black grains of DAB. Yellow, resident leptomeningeal cells; blue, 
bundles of collagen fibres; red, vascular endothelial basal laminas. MΦ, 
macrophages; L, vascular lumen. 

Extended Data Figure 2 | Interference with TMBP cell transport within 
the CSF does not inhibit their accumulation in the spinal cord and 
clinical EAE. a, TMBP cell transport from the cisterna magna to the 
subarachnoidal space of the spinal cord is effectively blocked by Matrigel. 
Left, macroscopic image of the cisterna magna and the adjacent parts 
of the CNS after injection of Matrigel mixed with Evans blue dye. Cb, 
cerebellum; Ce, cerebrum; SC, spinal cord. Arrow, cisterna magna filled 
with blue-stained Matrigel. Right, PBS or Matrigel was injected in the 
cisterna magna of naive animals. TMBP-GFP cells were injected 24 h later 
intra-cisternally. TPLSM of the medulla oblongata and the cervical 
spinal cord was performed 6 h after the TMBP-GFP cell injection. Shown 
are 3D reconstructions of the TPLSM recordings. Notably, in Matrigel-injected 
animals the TMBP cells remained localized in the cisterna 
magna but did not reach the cervical spinal cord, indicating that the 
Matrigel efficiently blocked the migration of cells from the cisterna to 
the subjacent leptomeninges. In controls (PBS-injected animals), TMBP 
cells readily reached the leptomeninges of the cervical spinal cord. Green, 
TMBP-GFP cells; blue, collagen. Scale bars, 50 µm. b, Leptomeningeal T-cell 
infiltration and clinical disease are not impaired after Matrigel blockage. 
Matrigel was injected into the cisterna magna 2 days after i.v. transfer 
of TMBP-GFP cells. PBS i.t.-injected animals were used as control. Left, 
TPLSM recordings of thoracic leptomeninges 3.5 days after transfer. 
Representative data of 2 independent experiments. Scale bar, 200 µm. 
Right, clinical score assessment. Representative data of 2 independent 
experiments including 4 animals per treatment (n = 16). Mean ± s.e.m. 
c, TMBP cell transport from the cisterna magna to the subarachnoidal 
space of the spinal cord is effectively blocked by interrupting the CSF 
flow. Resting T cells (5 x 10°) were injected into the cisterna magna 
either in control animals (upper panel) or in animals where the CSF flow 
was interrupted by introducing a CSF leakage at the level of the cervical 
spinal cord (lower panel). The efflux of the CSF induced a collapse of the 
subjacent thoracic subarachnoidal space (SAS) and thereby abolished 
the CSF flow in this area. TPLSM recordings of the thoracic spinal 
cord and of the medulla oblongata 6 h after i.t. injection. Depicted are 
3D reconstructions of TPLSM recordings of the thoracic spinal cord 
leptomeninges, overviews of the same area with magnifications of areas of 
interest (I, II) and overviews of the medulla oblongata. Green, TMBP-GFP 
cells; red, blood vessels; blue, collagen. Arrows, representative TMBP-GFP 
cells. Scale bars, 300 µm for thoracic spinal cord and 50 µm for medulla 
oblongata. Note that in the animals with disrupted CSF flow, no TMBP cells 
were detectable at the level of the spinal cord despite the high number of 
TMBP cells in the medulla oblongata, indicating that the leakage completely 
blocked the transport of cells from the cisterna magna to the subjacent 
spinal cord leptomeninges. d, e, Interrupting the CSF flow does not impair 
TMBP cell accumulation in the spinal cord leptomeninges. d, Shown are 
3D reconstructions of TPLSM recordings of the thoracic spinal cord 
leptomeninges after i.v. transfer of TMBP-GFP cells. Images before (70 h after 
transfer) and after introducing a CSF leakage at the level of the cervical 
spinal cord (82 h after transfer). Animals with intact flow were used as 
control. Green, TMBP-GFP cells; red, meningeal blood vessels; blue, collagen. 
Scale bars, 300 µm. e, Left, TPLSM overviews of the thoracic spinal cord 
leptomeninges. The numbers of TMBP-GFP cells that had emigrated from 
the local vessels into the leptomeningeal milieu (red false colours) within 
the 12 h interval did not differ between controls and ‘CSF flow interrupted’ 
animals. Green and red marks (overlaid red dots), TMBP cells located 
inside or outside the vessels, respectively; blue (false colour), meningeal 
blood vessels. Scale bars, 200 µm. Right, relative increase of TMBP-GFP cells 
located in the extravascular compartment at the indicated time points 
during imaging either in control animals or in animals where the CSF flow was 
interrupted (leakage). Representative data of 3 independent experiments 
(Kruskal-Wallis ANOVA followed by Dunn's multiple comparison test, 
*P < 0.05). Mean ± s.d. 

Extended Data Figure 3 | TMBP cells entering the leptomeningeal milieu 
from local vessels become detached and float within the CSF. a, TMBP 
cell extravasation from the leptomeningeal vessels. Left, TPLSM during 
the pre-invasion phase. Representative TMBP cell diapedesis. Depicted 
is a very early transmigration event, that is, the majority of both TMBP 
cells and CD11b+ myeloid cells are located inside the vascular lumina. 
Red, meningeal blood vessels; blue, TMBP-Lifeact-Turquoise2 cells; white, 
CD11b+ myeloid cells labelled by i.v. administration of fluorescently 
conjugated anti-CD11b monoclonal antibody during imaging. Scale bars, 
10 µm. Right, immuno-electron microscopy of TMBP cell extravasations 
observed at the indicated time points after transfer. Yellow, endothelial 
layer of leptomeningeal vessels; red, endothelial basal lamina. TMBP-GFP 
cells are marked by cytoplasmic DAB. Extravasating TMBP cells were 
often found to be surrounded by endothelial processes. Thereby, notable 
abluminal or luminal gaps (arrow heads) of the endothelial layer 
were frequently observed. Representative electron micrographs at the 
indicated time points. L, vascular lumen; T, T cells. b, Surface-rendered 
3D reconstruction of the meningeal compartment. TPLSM recording in 
the early phase of leptomeningeal TMBP cell infiltration. Red, meningeal 
blood vessels; green, TMBP-GFP cells; blue, collagen. Scale bar, 300 µm. 
c, Visualization of a TMBP cell detaching from the leptomeninges into 
the CSF. Intravital video-microscopy during the established phase of 
leptomeningeal TMBP cell infiltration depicts a TMBP-GFP cell (false colour 
yellow, indicated by the white arrow) in the process of detachment 
from the pial surface. Yellow and blue lines, crawling and rolling steps, 
respectively; dotted grey arrows, direction of the CSF flow. Scale bars, 
50 µm. d, In situ light exposure induces efficient and high contrast 
photoconversion of TMBP-Dendra2 cells without influencing T-cell motility. 
Images show representative TPLSM leptomeningeal overviews during 
established leptomeningeal TMBP cell infiltration before (0 min), 5 min, 
and 120 min after photoconversion. Red, meningeal blood vessels; 
green, non-photoconverted TMBP-Dendra2 cells; yellow, photoconverted 
TMBP-Dendra2 cells; blue and white circles, representative non-photoconverted 
or photoconverted TMBP-Dendra2 cells, respectively. Plotted are 
mean velocities of individual TMBP-Dendra2 cells before and 2 h after 
photoconversion. Data refer to 30 min time-lapse recordings. Mean 
values from 239 cells from at least 3 independent experiments (two-tailed 
Mann-Whitney U-test). e, Rapid turnover of effector T cells in the 
leptomeninges. Intravital TPLSM recordings on leptomeninges during the 
established phase of leptomeningeal TMBP cell infiltration. Images show 
distribution of TMBP-Dendra2 cells before or 4 h after photoconversion. For 
better visualization, red and green dots were overlaid onto photoconverted 
and non-photoconverted T cells, respectively. Grey (false colour), vessel 
lumen. Scale bar, 100 µm. Right, relative changes of photoconverted versus 
non-photoconverted TMBP-Dendra2 cells over time obtained from intravital 
recordings. Representative data of at least 3 independent experiments. 

Extended Data Figure 4 | CSF-derived TMBP cells do not represent a 
specialized sub-population compared to TMBP cells in the meninges 
and CNS parenchyma but they display a lower activation profile. 
a, Similarities in transcriptome profiles between TMBP cells in blood, CSF, 
leptomeninges and CNS parenchyma. RNA-seq of TMBP-GFP cells sorted 
from blood, meninges (Men), CSF and CNS parenchyma (Par) 3 days after 
transfer and on in vitro cultured TMBP-GFP cells sorted 20 h and 6 days after 
antigenic stimulation (blast and resting T cells, respectively). Principal 
component analysis of the transcriptomes for all six T cell populations 
(left) and for the four ex vivo populations (right) show similar profiles of 
TMBP-GFP cells in the CSF and the other CNS compartments compared to 
blood and culture. Numbers in parentheses, proportion of total variability 
calculated for each principal component. Each data point represents a 
biological replicate. b, TMBP cells display similar TCR repertoires in the 
CNS compartments. Top, RNA-seq as in a. Normalized expression of 
invariant TCR complex components (left) and mean frequencies of the 
TCR Vβ genes (right) determined as the proportion of RNA-seq reads 
that map to certain Vβ segments among all reads mapping to the entire 
set of 24 rat Vβ segments. Shown are the 6 most abundant Vβ segments. 
Bottom, TMBP-GFP cells from culture (4 days after antigen stimulation), 
blood, or from the indicated CNS compartments on day 4 after T-cell 
transfer. Expression (flow cytometry) of different TCR Vβ chains. 
Percentages of the different Vβ chains for three different TMBP cell lines. 
Ex vivo cell staining containing pooled cells from the respective organs 
of 3-4 animals (n = 10). c-e, TMBP cells within blood and CSF are not 
activated in contrast to TMBP cells in meninges and CNS parenchyma. 
c, Amounts of IFNγ and IL-17 (quantitative PCR) in TMBP cells from blood 
or the indicated CNS compartments at the indicated time points after 
transfer. Data are mean ± s.d. of duplicate measurements. Representative 
data of 4 independent experiments including at least 2 animals per group 
per time point. n.d., not determined owing to lack of cells within the CNS 
parenchyma at the early leptomeningeal TMBP cell infiltration phase. 
d, Cell surface expression of activation markers CD25 and CD134 (flow 
cytometry) of TMBP-GFP cells at day 4 after transfer. Black, isotype control; 
red, CD25 or CD134. Representative data of 4 independent experiments 
each combining cells from 4 or 5 animals for each compartment. 
e, Intracellular IFNγ and IL-17 production in TMBP cells isolated during 
the indicated phases of EAE. Representative data of 4 independent 
experiments each combining cells from 4 or 5 animals for each 
compartment (n = 17). f, CSF-derived TMBP cells produce cytokines upon 
antigenic stimulation in CSF. TMBP-GFP cells isolated at day 3 after transfer 
from CSF or cultured resting TMBP-GFP cells were stimulated in vitro with 
MBP in CSF from naive rats. IFNγ and IL-17 production (quantitative 
PCR). Data are mean ± s.d. of duplicate measurements representative of 
2 independent experiments (n = 8). House-keeping gene, β-actin (Actb) (c, f). 

Extended Data Figure 5 | The trafficking of effector T cells in the 
CNS compartments depends on their reactivation levels and the 
inflammatory state in the CNS. a, TMOG cells have a low encephalitogenic 
potential and display low reactivation levels in the CNS. Left, IL-17 and 
IFNγ expression measured by quantitative PCR in TMOG-GFP cells. 
House-keeping gene, β-actin (Actb) (a, c, d). Representative data of 3 independent 
experiments (data are mean ± s.d. of duplicate measurements). Note, 
TMOG cells in meninges and CNS parenchyma produce less cytokines 
than the highly pathogenic TMBP cells (Extended Data Fig. 4c), but more 
than non-pathogenic TOVA cells (Extended Data Fig. 4d). Right, clinical 
course after TMOG-GFP cell i.v. transfer. Cumulative data of 3 independent 
experiments including at least 4 animals per group. TMOG cell CNS 
infiltration in this model considerably precedes disease onset (Fig. 2e). 

b, Morphological analysis of the distribution of effector T cells with 
different antigen specificities and reactivation/pathogenic potentials in 
the CNS. Fluorescence-microscopy-derived overviews and magnified 
subsets (I-IV) of thoracic spinal cord sections showing distribution of 
fluorescently labelled T cells with the indicated antigen specificity 4 days 
after T-cell transfer. Open white and closed yellow arrows, representative 
T cells located in the meningeal compartment or in the CNS parenchyma, 
respectively. Scale bars: overviews, 200 μm; magnification, 100 μm. 
The infiltration behaviour of brain-non-reactive TOVA cells is similar 
to that of pathogenic TMBP cells when the CNS tissue is inflamed (after 
co-transfer of the TOVA cells with TMBP cells). c, TOVA cells encounter an 
inflammatory milieu in the meninges when co-transferred with TMBP 
cells. Quantitative PCR for the indicated chemokines, integrin ligands and 
cytokines was performed on meninges and CNS parenchyma isolated from 
naive animals (white columns) or from animals that 4 days previously had 
received either TOVA-Cherry cells (grey columns) or TOVA-Cherry cells together 
with TMBP-GFP cells (black columns). d, When co-transferred with TMBP 
cells, TOVA cells do not show signs of activation in the CNS. IFNγ and 
IL-17 expression (quantitative PCR) in TMBP-GFP or TOVA-Cherry cells sorted 
from the indicated compartments in animals transferred 4 days previously 
with these cells. Representative data of 3 independent experiments; data 
are mean ± s.d. of duplicate measurements (c, d). e, TOVA cells are released 
from the meningeal compartment into the CSF in higher numbers than 
TMBP cells. Quantification of TOVA-Cherry cells and co-injected TMBP-GFP 
cells during established leptomeningeal TMBP cell infiltration (flow 
cytometry). Representative data of 3 independent experiments including 
3 animals per group. Data are mean ± s.e.m. (two-tailed Mann-Whitney 
U-test). 

Extended Data Figure 6 | TMBP cells in the leptomeningeal milieu 
closely interact with resident macrophages. a, Composition of the 
leptomeningeal milieu and interactions of TMBP cells with resident 
macrophages. Upper left, representative still and time-encoded 
projection illustrating, via a time colour-code scale, the area that 
single leptomeningeal macrophages scan with their processes over a 
recording period of 10 min (32 s time intervals). Green, GFP+ meningeal 
macrophages in GFP+ bone marrow chimaeras’; red, blood vessels. Upper 
right, original TPLSM picture and surface-rendered 3D reconstructions 
of a representative spot within the meningeal milieu during established 
leptomeningeal TMBP cell infiltration. Green, extravasated motile TMBP-GFP 
cells; red, meningeal phagocytes (MΦ); yellow trajectories, migration 
paths of individual TMBP-GFP cells; blue, collagen fibres. Magnification of 
an individual region (white dotted rectangle) indicates contacts (yellow 
arrows) between TMBP-GFP cells and meningeal phagocytes. Lower left, 
TMBP cells in direct contact (yellow) or not in contact (green) with 
meningeal macrophages (red). Surface-rendered 3D reconstruction of 
a representative still from a 30 min TPLSM recording. Lower right, a 
representative TMBP-GFP effector cell (red, false colour) migrating within 
the leptomeningeal compartment contacting several resident meningeal 
macrophages (1-5, yellow contours). Orange line, migration path of the 
T cell; green, TMBP-GFP cells; grey, vessel lumen (false colour). Scale bars, 
50 μm. b, Ultrastructural analysis confirms direct and intense contacts 
between T cells and meningeal macrophages. Representative overview of 
cells within the spinal cord leptomeninges illustrating the organization of 
the leptomeningeal milieu at the ultrastructural level. Lymphocytes (L) 

in the subarachnoidal space are located between or adjacent to collagen 
fibrils (arrow heads) and in direct vicinity of phagocytes (MΦ) or local 
resident cells (M). Right, higher magnification of a lymphocyte in close 
contact with a macrophage. c, TMBP cells do not follow chemotactic 
gradients towards meningeal phagocytes. Vector analysis of the migration 
steps of TMBP-GFP cells in relation to a fixed macrophage in areas with 
a radius (ρ) of 25, 50 or 100 μm. The cosine of the angle between the 
macrophage-T-cell axes (AC) and the migration vector (AB) was 
calculated for each step. The values 1 or −1 correspond to a TMBP-GFP cell 
moving towards the meningeal phagocyte or in the opposite direction, 
respectively. Numbers indicate average direction ± s.d. of the steps 
examined for each radius. The average is always close to zero and the 
standard deviation is very broad, excluding the presence of targeted 
migration. Representative data of 4 independent experiments for each 
radius including 4396, 15870 and 57535 steps for ρ = 25, ρ = 50 and 
ρ = 100 μm, respectively. d, Integrin ligands are highly expressed in 
meningeal phagocytes. Bar plots show quantitative PCR data for the 
indicated integrin ligands on meningeal phagocytes labelled with Texas 
Red sorted from naive animals or during the early or established phases 
of leptomeningeal TMBP cell infiltration. House-keeping gene, β-actin 
(Actb). Mean ± s.d. of duplicate measurements. Histogram plots show 
corresponding protein expression (flow cytometry) during the established 
phase of leptomeningeal TMBP cell infiltration. Bottom, representative 
confocal images of leptomeningeal spots acquired in naive GFP+ animals 
or in GFP+ animals with established leptomeningeal TMBP cell infiltration 
after in vivo staining with the indicated antibodies. Single fluorescent 
channels and merged pictures. The expression of the tested integrin 
ligands is enriched in the vicinity of the leptomeningeal blood vessels in 
naive tissue (left) but widely distributed during the established phase of 
leptomeningeal TMBP cell infiltration (right). 
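
The directional measure described in panel c can be illustrated with a minimal Python sketch. It assumes two-dimensional positions in μm, a hypothetical track format (a list of consecutive (x, y) positions of one cell) and that a step is counted when it starts within the given radius of the macrophage; none of these details is stated in the legend. The function names `step_cosine` and `direction_stats` are illustrative and are not taken from the authors' analysis software.

```python
import math
import statistics


def step_cosine(a, b, c):
    """Cosine of the angle between migration step AB and the cell-to-macrophage axis AC.

    a, b: (x, y) positions of the T cell at the start and end of one step.
    c: (x, y) position of the fixed macrophage.
    Returns None when the cell did not move or sits exactly on the macrophage.
    """
    ab = (b[0] - a[0], b[1] - a[1])
    ac = (c[0] - a[0], c[1] - a[1])
    norm_ab = math.hypot(*ab)
    norm_ac = math.hypot(*ac)
    if norm_ab == 0 or norm_ac == 0:
        return None
    return (ab[0] * ac[0] + ab[1] * ac[1]) / (norm_ab * norm_ac)


def direction_stats(track, macrophage, radius):
    """Mean, sample s.d. and count of step cosines for steps starting within `radius`.

    Assumption: a step belongs to the analysed area if its start point lies
    within `radius` of the macrophage.
    """
    cosines = []
    for a, b in zip(track, track[1:]):
        if math.dist(a, macrophage) > radius:
            continue  # step starts outside the analysed area
        cos = step_cosine(a, b, macrophage)
        if cos is not None:
            cosines.append(cos)
    if not cosines:
        return None, None, 0
    mean = statistics.mean(cosines)
    sd = statistics.stdev(cosines) if len(cosines) > 1 else 0.0
    return mean, sd, len(cosines)
```

Under these assumptions, a mean close to 0 with a broad s.d. would be read as the absence of directed migration, whereas a mean close to 1 would indicate movement towards the macrophage.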

Extended Data Figure 7 | Locomotion behaviour of TMBP cells in the 
leptomeningeal milieu and its regulation by VLA-4/LFA-1 integrins. 
a, TMBP cells express high levels of integrins. VLA-4 (Itga4) and LFA-1 
(Itgal) expression (quantitative PCR) on TMBP-GFP cells from meninges 
or CNS parenchyma at the indicated time points of leptomeningeal 
T-cell infiltration. Representative data of 3 independent experiments. 
Data are mean ± s.d. b, VLA-4 is expressed in the active conformation 
on TMBP cells in the leptomeninges but not in the CSF. HUTS-4 antibody 
(directed against the activated conformation of β1-integrins), 
activation-independent anti-VLA-4 antibody or control antibody were injected i.t. 
into Lewis rats during the established phase of leptomeningeal T-cell 
infiltration. TMBP-GFP cells were isolated from meninges and CSF 4 h 
later; protein expression was measured by flow cytometry. Percentages 
of positive T cells are indicated. Representative data of 3 independent 
experiments. c, Integrin blockade accelerates T-cell migration. Dot plots show 
TMBP-GFP cell velocity before (control) and 4 h after i.t. treatment 
with the indicated anti-integrin monoclonal antibodies. Data from 30 min 
TPLSM recordings on leptomeninges during the early phase of TMBP 
cell leptomeningeal infiltration. Data are mean values of individual 
T cells from at least 3 independent experiments including 674 cells 
(Kruskal-Wallis ANOVA followed by Dunn's multiple comparison). 
d, TMBP cell velocity is not influenced by integrins in the absence of CSF 
flow. Velocities of TMBP-GFP cells in the leptomeningeal milieu analysed 
during the established phase of leptomeningeal TMBP cell infiltration 
after interruption of the CSF flow in the presence or absence (control) of 
anti-integrin monoclonal antibodies. The interruption of the CSF flow 
was achieved by removal of the arachnoidea causing the efflux of the CSF 
from the subarachnoidal space. 30 min TPLSM recordings. Data are mean 
values of individual T cells from 3 independent experiments including 
166 cells (two-tailed Mann-Whitney U-test). e, Integrin blockade does not 
change the motility pattern of TMBP cells. Mathematical analyses of at least 
10 TPLSM recordings for each treatment (as in Fig. 3a, b). The Brownian 
walk of effector TMBP-GFP cells was not changed after i.t. application 
of anti-LFA-1/VLA-4 monoclonal antibodies. f, Effects of integrins 
on TMBP cell adhesion to the leptomeninges during the early phase of 
meningeal infiltration resemble those during the established phase. 30 min 
intravital TPLSM recordings on leptomeninges during the early phase 
of leptomeningeal TMBP cell infiltration. Animals were treated i.t. with 
PBS (control) or with the indicated anti-integrin monoclonal antibodies. 
Bar plots show number of TMBP-GFP cells 4 h after treatment in CSF and 
meninges (flow cytometry). At this time point no TMBP-GFP cells were 
detectable in the CNS parenchyma (Kruskal-Wallis ANOVA followed by 
Dunn's multiple comparison test). Images show representative stills from 
30 min TPLSM recordings and corresponding time projections of T-cell 
tracks before and 4 h after the indicated treatment. T-cell detachment is 
indicated by the reduction of TMBP cell trajectories after i.t. monoclonal 
antibody treatment. Blue, TMBP-Lifeact-Turquoise2 cells. Scale bar, 50 μm. 
Representative data of 2 independent experiments. g, Integrin blockade 
increases the turnover of TMBP cells in the leptomeninges. Quantification 
of photoconverted TMBP-Dendra2 cells after i.t. application of anti-LFA-1/VLA-4 
monoclonal antibodies or control IgG (control) during TPLSM of 
the established phase of leptomeningeal TMBP-Dendra2 cell infiltration. 
Representative kinetic of at least 3 independent experiments. *P < 0.05, 
**P < 0.01, ***P < 0.001 (c, f). 


© 2016 Macmillan Publishers Limited. All rights reserved 


[Extended Data Figure 8 graphic; the caption follows. Recoverable panel labels: a, sorted TMBP-GFP cells (meninges, parenchyma; early, established); b, total tissue (meninges, parenchyma; naive, early, established); c, meningeal phagocytes (Cxcl9, Cxcl10, Cxcl11, Ccl5, Cxcl12; naive, early, established); d, DAPI; e, CSF, meninges, parenchyma; f, meandering index; h, control IgG, z-sections.]

Extended Data Figure 8 | Role of chemokines in TMBP cell motility in 
the leptomeningeal milieu. a, Chemokine receptor expression profile of 
TMBP cells during EAE. Quantitative PCR data for the indicated chemokine 
receptors from TMBP-GFP cells isolated from spinal cord meninges and 
parenchyma during the early and established phases of leptomeningeal 
TMBP cell infiltration. Representative data of 3 independent experiments. 
Data are mean ± s.d. of replicate measurements. b, c, Inflammatory 
chemokines are upregulated in the CNS milieus and in meningeal 
macrophages during EAE. Quantitative PCR in naive animals or in 
animals during the early and established phases of leptomeningeal TMBP 
cell infiltration. Chemokine ligand expression either on meninges and 
CNS parenchyma (b) or on Texas-Red-labelled meningeal phagocytes 
isolated by flow cytometry (c). House-keeping gene, β-actin (Actb) (a–c). 
Representative data of 3 different experiments. Data are mean ± s.d. 
d, TMBP cells establish direct contact with CXCL11+, MHC-II+ meningeal 
phagocytes. Leptomeninges during early leptomeningeal TMBP cell 
infiltration stained for CXCL11, Iba-1 and DAPI (upper row) or CXCL11, 
MHC-II and DAPI (lower row). Green, TMBP-GFP cells. Scale bars, 50 μm. 
In the magnifications (right) yellow arrows indicate direct contacts 
between TMBP-GFP cells and CXCL11+ meningeal phagocytes. Scale bars, 
20 μm. e, Gαi signalling blockade induces a release of TMBP cells from the 
leptomeninges into the CSF. Left, plots show flow cytometry quantification 
of TMBP-GFP cells during established leptomeningeal TMBP cell infiltration 
in the indicated compartments 4 h after i.t. treatment with PBS (control) 
or PTX. Right, intravital image of spinal cord leptomeninges recorded 4 h 
after i.t. injection of PTX during the established phase of leptomeningeal 
TMBP cell infiltration. During the recording time, two TMBP-GFP cells 
highlighted in orange were released into the CSF. Grey (false colour), 
leptomeningeal blood vessels, phagocytes; green, TMBP-GFP cells; orange 
dotted lines, tracks of individual TMBP-GFP cells before being washed away 
into the CSF; blue dotted lines, contour of individual resident phagocytes; 
white arrows, direction of CSF flow. Scale bars, 50 μm. f, Interference 
with chemokine signalling affects TMBP cell contacts with leptomeningeal 
phagocytes but not TMBP cell velocity and meandering index. 30 min time-lapse 
recordings during the early (upper plots) and established phases 
(lower plots) of leptomeningeal TMBP cell infiltration. Bar plots show 
mean contact duration and contact frequencies (early and established 
phases: 968 and 498 motile cells, respectively) between TMBP-GFP cells and 
Texas-Red-labelled meningeal phagocytes before (control) and 4 h after 
the indicated i.t. treatments. Data are mean ± s.e.m. of 3 independent 
experiments. Dot plots show TMBP-GFP cell velocity and meandering index 
before (control) and 4 h after the indicated i.t. treatment. Data are mean 
values of individual T cells from at least 3 independent experiments per 
treatment and time point (early phase 1506 cells and established phase 
860 cells; Kruskal-Wallis ANOVA followed by Dunn’s multiple comparison 
test). g, Gαi blockade does not change the migratory pattern of TMBP 
cells. Motility pattern of TMBP-GFP cells in the leptomeninges during 
the established phase of leptomeningeal TMBP cell infiltration before 
and 4 h after PTX treatment (as in Fig. 3a, b). Directional analysis and 
probability plots of T cell displacement for the Lévy (red), Weibull (blue) 
and normal (Gauss, green) distributions. The TMBP-GFP cells maintained 
their Brownian walk after PTX treatment. Analysis of at least eight 
TPLSM recordings. h, CXCR3 blockade alone recapitulates the effect of 
Gαi inhibition on TMBP-GFP cells released from the meninges into the CSF. 
Representative TPLSM overviews of spinal cord leptomeninges before and 
8 h after i.v. injection of isotype control antibodies (control IgG) or 
anti-CXCR3 monoclonal antibody during established leptomeningeal 
TMBP cell infiltration. Magnifications of individual regions originating 
from the overviews illustrate the position of the effector T cells in the 
z-axis in the leptomeningeal space via a colour-code scale. Note, TMBP-GFP 
cells are preferentially detached from the surface of the pia mater. Green, 
leptomeningeal blood vessels; red, phagocytes; blue, TMBP-Lifeact-Turquoise2 
cells. Scale bars, 50 μm. *P < 0.05, ***P < 0.001 (e, f). 

Extended Data Figure 9 | Motility pattern of brain-non-reactive TOVA 
cells in the leptomeninges and their detachment into the CSF. 
a, b, The motility pattern of TOVA cells in the leptomeninges resembles 
that of TMBP cells. TOVA-GFP cell motility in the leptomeninges during 
established inflammation (TPLSM 3.5 days after co-transfer with 
non-labelled TMBP cells). a, TOVA cells do not follow a preferential 
direction. Directional analyses of the TOVA-GFP cell tracks (as in Fig. 3a). 
Directions of in-plane movement (left) and associated probability 
for an angle varying within [0; 360] degrees in the fitting plane (right). 
b, TOVA cells move in a Brownian random walk. Motility pattern of 
TOVA-GFP cells (as in Fig. 3b). Analysis of at least 6 TPLSM recordings. 

c, TOVA-GFP cells are less adhesive to meningeal structures in comparison 
to their myelin-reactive counterparts. Mean contact duration 
(417 contacts per 127 cells); contact frequencies (127 cells) between 
MBP- or OVA-reactive T cells and meningeal phagocytes; mean velocities 
(304 cells) and meandering indices (236 cells) of MBP- or OVA-reactive 
T cells; percentage of time spent by these T cells in contact with meningeal 
phagocytes during the 30 min recording. Results from intravital TPLSM 
recordings during established meningeal inflammation on spinal cord 
leptomeninges, after transfer of TMBP-GFP cells or of TOVA-GFP cells 
together with non-labelled TMBP cells. Data are mean values ± s.e.m. 
of 3 independent experiments per transfer (two-tailed Mann-Whitney 
U-test). d, Chemokine receptors and integrin expression in TOVA-GFP cells. 
TOVA-GFP cells transferred into Lewis rats together with non-labelled TMBP 
cells were sorted from meninges and CNS parenchyma during established 
meningeal inflammation. Quantitative PCR. House-keeping gene, 
β-actin (Actb). Representative data of 3 independent experiments. Data 
are mean ± s.d. of duplicate measurements. e, Integrin or Gαi signalling 
interference induces a release of TOVA-GFP cells from the leptomeninges 
into the CSF. TOVA-GFP cells were co-injected with non-labelled TMBP cells. 
During established meningeal inflammation (3.5 days after transfer) 
animals were treated i.t. with PBS (control), anti-LFA-1/VLA-4 
monoclonal antibodies (left) or PTX (right). Absolute numbers of TOVA-GFP 
cells in CSF, spinal cord meninges and parenchyma 4 h after treatment. 
Data are mean ± s.e.m. of representative data from at least 3 independent 
experiments per treatment including 3 animals per group (two-tailed 
Mann-Whitney U-test). f–h, Gαi signalling interference does not change 
the motility pattern of TOVA cells. f, g, The Brownian walk of TOVA cells 
was not changed after PTX treatment. Directional analyses and motility 
pattern of TOVA-GFP cells 4 h after i.t. injection of PTX. Mathematical 
analyses of at least 6 TPLSM recordings (as in Fig. 3a, b) 4 days after 
co-transfer with TMBP cells. h, Interference with chemokine signalling does 
not affect TOVA cell straightness and velocity. Mean velocities (498 cells) 
and meandering indices (428 cells) of TOVA-GFP cells at the depicted time 
points of leptomeningeal T-cell inflammation. 30 min TPLSM recordings 
were acquired before and 4 h after i.t. treatment with PTX. *P < 0.05, 
**P < 0.01, ***P < 0.001 (c–e). 

Extended Data Figure 10 | Reattachment of TMBP cells from the CSF 
to the leptomeninges and its regulation by integrins, chemokines 
and T-cell activation/CNS inflammation. a, Distribution of reattached 
TMBP cells resembles that of early EAE lesions. TMBP-GFP cells are located 
diffusely in the meninges and the adjacent spinal cord parenchyma after 
reattachment from the CSF. TMBP-GFP cells retrieved from CNS tissue on 
day 3 after transfer were injected i.t. into naive animals. Fluorescence 
microscopy images of the fixed cervical spinal cord 20 h later. Insets, 
magnifications of areas of interest (right; I, II). Arrows, TMBP-GFP cells 
in the leptomeninges (white) and CNS parenchyma (yellow). Green, 
TMBP-GFP cells. Scale bars, 100 μm. b, Increased TMBP cell rolling and 
floating after forced ventilation indicate that respiration is a major driving 
force for the spinal cord CSF flow. Time-lapse video microscopy of 
TMBP-GFP cells during the established phase of leptomeningeal TMBP cell 
infiltration performed under 3 different conditions: (1) during standard 
conditions (control; respiration rate: 81 bpm; cardiac rate: 230 bpm); 
(2) during hyperventilation (hypervent; respiration rate: 100 bpm; cardiac 
rate: 200 bpm); and (3) during hyperventilation following administration 
of methylscopolamine (0.05 mg kg−1) to block the 
hyperventilation-induced vagal influence on the heart (Scopolamine/hypervent; respiration 
rate: 100 bpm; cardiac rate: 230 bpm). Hyperventilation induced a strong 
increase in rolling/floating TMBP-GFP cells that was not changed after 
methylscopolamine. 185 s time projections. Green, TMBP-GFP cells; white 
arrows, representative T cells rolling/floating in the CSF. Scale bars, 50 μm. 
Representative data of 3 independent experiments. c, TMBP cell transport 
in the CSF after i.t. injection. Distribution of TMBP-GFP cells in different 
levels of the spinal cord tissues after localized injection of the cells into 
the cisterna magna or the subarachnoidal space of the lumbar spinal 
cord. Flow cytometry analyses. Relative numbers of TMBP-GFP cells 6 and 
24 h after i.t. transfer in the indicated CNS compartments. Combined 
results of 2 independent experiments for each i.t. injected site (n = 14). 
d, TMBP cell infiltration in the different levels of the spinal cord during 
transfer EAE. Quantification (flow cytometry) of TMBP-GFP cells in the 
meninges or parenchyma of the indicated parts of the spinal cord. Data are 
mean ± s.e.m. of representative data from 2 independent experiments per 
treatment including 2 or 3 animals per group (n = 41). e, Activated but not 
resting TMBP cells induce EAE after i.t. transfer. TMBP-GFP cells activated 
by antigenic stimulation or resting cells were i.t. transferred into naive 
animals. Clinical scores at the indicated time points. Representative 
results of 3 independent experiments with 4–6 animals per group. 
Data are mean ± s.d. (n = 28). f, Interference with integrin binding 
reduces the encephalitogenic potential of i.t. transferred TMBP-GFP cells. 
TMBP-GFP cell blasts were pre-treated with either an IgG control antibody 
or a combination of anti-LFA-1/VLA-4 monoclonal antibodies before 
i.t. transfer into naive animals. Clinical scores. Representative results 
of 3 independent experiments with 2 or 3 animals per group. Data are 
mean ± s.e.m. (n = 16, two-tailed Mann-Whitney U-test). g, TMBP cells 
invade the inflamed CNS tissue more efficiently than TOVA cells. Left, 
the same number of non-activated TMBP-GFP and TOVA-Cherry cells were 
co-injected i.t. into animals at the onset of clinical EAE, that is, 3 days 
after i.v. transfer with unlabelled TMBP-GFP cells. 20 h after i.t. injection, 
the entry of TMBP-GFP and TOVA-Cherry cells into the CNS tissue 
(including spinal cord meninges and parenchyma) was quantified by 
flow cytometry. Representative results of 2 independent experiments with 
4 animals per group. Data are mean ± s.d. (n = 8). Right, resting or in vitro 
activated TOVA-GFP cells were i.t. injected into naive animals. In 
addition, a group of animals receiving resting TOVA-GFP cells were 
co-injected intrathecally with 25 μg of OVA antigen. 20 h after i.t. injection, 
TOVA-GFP cells in the spinal cord tissue were quantified. Representative of 
2 independent experiments with 2 animals per group. Data are mean ± s.d. 
(n = 12, two-tailed Mann-Whitney U-test (left) or Kruskal-Wallis 
ANOVA followed by Dunn's multiple comparison test (right)). 
h, Inflammation of the CNS tissue increases TMBP cell entry from the CSF. 
CSF-derived TMBP-GFP cells were injected i.t. either into naive animals or 
into animals at the onset of clinical EAE, that is, 3 days after i.v. transfer of 
non-labelled TMBP cells (that is, inflamed condition). Cell quantifications 
of TMBP-GFP cells in the CNS tissue and CSF by flow cytometry 20 h later 
when the re-transferred T cells had maximally infiltrated the inflamed 
CNS of the recipient animals. Data are mean ± s.d. of representative results 
of 3 independent experiments (n = 12, two-tailed Mann-Whitney 
U-test). i, Blocking of integrin or Gαi-signalling reduces TMBP cell 
migration from the CSF into the inflamed CNS tissue. CNS-derived 
TMBP-GFP cells were pre-treated with anti-VLA-4 or anti-LFA-1 monoclonal 
antibodies or with isotype control IgG antibody (control) (left) or PTX 
or PBS as control (right). Cells were then injected i.t. in animals at the 
onset of EAE and quantified as in h. Data are mean ± s.d. of representative 
results of 3 independent experiments (n = 18, Kruskal-Wallis ANOVA 
followed by Dunn’s multiple comparison test (left) or two-tailed 
Mann-Whitney U-test, right). j–l, In active EAE induced by reactivated memory 
TMBP cells, trafficking of the memory TMBP cells between the distinct 
CNS compartments follows the same rules as the trafficking of TMBP cells 
during transfer EAE. j, Memory TMBP cells accumulate simultaneously 
in the meninges and the CSF before they occur in the choroid plexus. 
Quantification (flow cytometry) of GFP+ TMBP-memory cells in CSF, 
spinal cord meninges and parenchyma (left), or in CSF and choroid 
plexus of the fourth ventricle (right) at the indicated time points after 
immunization of 10-week-old memory animals. Data are mean ± s.d. of 
3 independent experiments (n = 13). k, CSF-derived memory TMBP cells 
present a low activation profile. IFNγ and IL-17 expression (quantitative 
PCR) in GFP+ memory TMBP cells 5 days after immunization. 
House-keeping gene, β-actin (Actb). Representative data of 3 independent 
experiments. Data are mean ± s.d. of duplicate measurements. l, Integrin 
and chemokine blockade reduces the number of TMBP-memory cells in the 
leptomeninges during active EAE. Intravital TPLSM recordings of spinal 
cord leptomeninges from 10-week-old memory animals 5 days after 
immunization. Quantification of GFP+ memory TMBP cells in the acquired 
images before (control) and 4 h after i.t. treatment of either anti-LFA-1/ 
VLA-4 or anti-CXCR3 blocking monoclonal antibodies. Data are mean 
values ± s.e.m. from 3 independent experiments including 5,947 cells 
(two-tailed Mann-Whitney U-test). *P < 0.05, **P < 0.01, ***P < 0.001 
(g–i, l). m, Schematic representation of the migratory behaviour of effector 
T cells in the leptomeningeal milieu and the roles of chemokines, integrins 
and activation in controlling the T-cell migration steps. Question marks 
indicate unresolved points. 

NEK7 is an essential mediator of NLRP3 activation 
downstream of potassium efflux 


Yuan He, Melody Y. Zeng, Dahai Yang, Benny Motro & Gabriel Núñez 


Inflammasomes are intracellular protein complexes that drive the 
activation of inflammatory caspases’. So far, four inflammasomes 
involving NLRP1, NLRP3, NLRC4 and AIM2 have been described 
that recruit the common adaptor protein ASC to activate caspase-1, 
leading to the secretion of mature IL-1β and IL-18 proteins. The 
NLRP3 inflammasome has been implicated in the pathogenesis 
of several acquired inflammatory diseases*® as well as cryopyrin- 
associated periodic fever syndromes (CAPS) caused by inherited 
NLRP3 mutations®’. Potassium efflux is a common step that is 
essential for NLRP3 inflammasome activation induced by many 
stimuli®?. Despite extensive investigation, the molecular mechanism 
leading to NLRP3 activation in response to potassium efflux remains 
unknown. Here we report the identification of NEK7, a member of 
the family of mammalian NIMA-related kinases (NEK proteins)!°, as 
an NLRP3-binding protein that acts downstream of potassium efflux 
to regulate NLRP3 oligomerization and activation. In the absence 
of NEK7, caspase-1 activation and IL-1β release were abrogated in 
response to signals that activate NLRP3, but not NLRC4 or AIM2 
inflammasomes. NLRP3-activating stimuli promoted the NLRP3- 
NEK7 interaction in a process that was dependent on potassium 
efflux. NLRP3 associated with the catalytic domain of NEK7, but the 
catalytic activity of NEK7 was shown to be dispensable for activation 
of the NLRP3 inflammasome. Activated macrophages formed a 
high-molecular-mass NLRP3-NEK7 complex, which, along with 
ASC oligomerization and ASC speck formation, was abrogated 
in the absence of NEK7. NEK7 was required for macrophages 
containing the CAPS-associated NLRP3(R258W) activating 
mutation to activate caspase-1. Mouse chimaeras reconstituted with 
wild-type, Nek7−/− or Nlrp3−/− haematopoietic cells showed that 
NEK7 was required for NLRP3 inflammasome activation in vivo. 
These studies demonstrate that NEK7 is an essential protein that acts 
downstream of potassium efflux to mediate NLRP3 inflammasome 
assembly and activation. 

To understand the signalling mechanism of NLRP3 inflammasome 
activation, we sought to identify proteins that interact with NLRP3 
after inflammasome activation. To purify NLRP3 protein complexes, 
we generated a triple-tagged NLRP3 (NLRP3-SFP) fused 
with three tags in the carboxyl terminus: S-tag, Flag (for detection), 
and a streptavidin-binding tag. Reconstitution of Nlrp3−/− immortalized 
bone-marrow-derived macrophages (iBMDMs) with NLRP3-SFP 
restored ATP-induced caspase-1 activation and IL-1β release 
(Extended Data Fig. 1a). We treated lipopolysaccharide (LPS)-primed 
reconstituted iBMDMs with ATP to induce NLRP3 activation, and 
searched for interacting partners of NLRP3 using liquid 
chromatography–mass spectrometry. The analysis revealed NEK7 as a major 
interacting partner of NLRP3 (Fig. 1a). NLRP3 did not associate with 
NEK6, a NEK7-related paralogue, or NEK9, another member of the 
NEK family'® (Extended Data Fig. 1b). The NLRP3-NEK7 inter- 
action was confirmed by pull-down assays using streptavidin beads 
or immunoprecipitation (Fig. 1b, c and Extended Data Fig. 1b-d). 


Notably, the NLRP3-NEK7 interaction was slightly increased by LPS 
priming, but was clearly enhanced after ATP stimulation (Fig. 1a–c 
and Extended Data Fig. 1b-d). The interaction of NLRP3 with NEK7 
was independent of ASC, caspase-1 or caspase-11 (Extended Data 
Fig. 1d). To determine the regions within NLRP3 that associate with 


Figure 1 | NEK7 interacts with NLRP3. a, Mass spectrometry analysis
of NLRP3 and NEK7 peptides after purification of NLRP3-associated
proteins. b, NLRP3-SFP was pulled down and immunoblotted with
indicated antibodies. Strep, streptavidin. c, LPS-primed BMDMs
were left unstimulated or stimulated with ATP for 30 min. Cell lysates
were immunoprecipitated (IP) and immunoblotted (IB) with
indicated antibodies. d, e, Wild-type or mutant NLRP3 (Δpyrin
or ΔLRR (leucine-rich repeats)) was expressed in HEK293T cells,
immunoprecipitated and analysed by immunoblotting. f, Wild-type or
mutant NEK7 was co-expressed with Flag-tagged NLRP3 in HEK293T
cells, immunoprecipitated and analysed by immunoblotting. Whole-cell
lysates are shown as the input. aa, amino acids; HA, haemagglutinin.
Results are representative of at least three independent experiments.
See Supplementary Fig. 1 for gel source data.

Figure 2 | NEK7 deficiency specifically abrogates the activation of the
NLRP3 inflammasome. a, BMDMs were left untreated or stimulated with
LPS, and cell lysates were immunoblotted with indicated antibodies.
b, d, Caspase-1 (CASP1) in the supernatant (sup.) and cell lysate (lys.)
was analysed in stimulated Nek7+/+ and Nek7−/− macrophages treated
with ATP, nigericin, gramicidin, poly(dA:dT) or Salmonella (b) or with
LLOMe, silica, Alum, MSU, calcium pyrophosphate dihydrate (CPPD) or
nano-SiO2 (d). Mock represents macrophages primed with LPS without
further stimulation. c, e, IL-1β release was measured. f, g, CRISPR-
Cas9-generated NEK7-deficient (NEK7 knockout (KO)) iBMDMs were
transduced with control or pHIV-NEK7 lentivirus, and stimulated with
LPS plus nigericin. Caspase-1 activation (f) and IL-1β release (g) were
analysed. h, i, Macrophages were transduced with control lentivirus or
shRNA lentiviruses targeting NEK7, and stimulated with LPS. Caspase-1
activation (h) and IL-1β release (i) were analysed. IL-1β release data
(c, e, g, i) are expressed as mean values. Error bars denote s.d. of triplicate
wells. Results are representative of three independent experiments.
See Supplementary Fig. 1 for gel source data.


NEK7, we expressed Flag-tagged wild-type or mutant NLRP3 in 
HEK293T cells. NEK7 interacted with wild-type or mutant NLRP3 lack- 
ing the amino-terminal pyrin domain, but not with NLRP3 lacking the 
carboxy-terminal leucine-rich repeats (Fig. 1d). Furthermore, NEK7 
did not associate with the singly expressed pyrin domain, the centrally 
located nucleotide-binding domain (NOD) or leucine-rich repeats (Fig.
1e), indicating that both the NOD and leucine-rich repeats are involved
in the interaction with NEK7. Conversely, the catalytic domain of NEK7, 
but not the N-terminal 33-amino-acid extension domain, interacted 
with NLRP3 (Fig. 1f). Further analysis showed that the N-terminal 
region (amino acid residues 34–212), but not the C-terminal region (residues
213–302), of the NEK7 catalytic domain¹¹ mediates the interaction
with NLRP3 (Fig. 1f). However, mutations of the catalytic Lys63 and 
Lys64 or the Gly43 amino acid residue that abolish the catalytic activity 
of NEK7 (refs 12, 13) did not impair the NLRP3-NEK7 interaction 
(Fig. 1f). These results indicate that NEK7 interacts with NLRP3 and 
the interaction is enhanced in response to NLRP3-activating stimuli. 
We next evaluated the requirement for NEK7 in NLRP3 inflammasome
activation. Because NEK7 deficiency leads to either embryonic
lethality or death of pups soon after birth¹⁴, we generated mouse
chimaeras after transplanting fetal liver cells from Nek7+/+ or Nek7−/−
embryos into lethally irradiated recipient mice. BMDMs from mice
reconstituted with Nek7−/− cells lacked detectable expression of
NEK7, but expressed normal amounts of NLRP3, caspase-1 and ASC
(Fig. 2a). Importantly, activation of caspase-1 and IL-1β release
induced by ATP, nigericin and the toxin gramicidin, three stimuli that
activate NLRP3, were abolished in Nek7−/− BMDMs (Fig. 2b, c). By
contrast, activation of caspase-1 and IL-1β release in response to
poly(deoxyadenylic-deoxythymidylic) acid (poly(dA:dT)), which activates the
AIM2 inflammasome, or Salmonella enterica serovar Typhimurium
(Salmonella), which activates the NLRC4 inflammasome, were not
affected in Nek7−/− BMDMs (Fig. 2b, c). Likewise, caspase-1 activation
and IL-1β release induced by particulate matter and the lysosomal
membrane-damaging agent Leu-Leu-OMe (LLOMe) were impaired
in Nek7−/− BMDMs (Fig. 2d, e). By contrast, TNF-α release induced
by all tested stimuli was unaffected in Nek7−/− BMDMs (Extended
Data Fig. 2a, b). In addition, NLRP3-dependent caspase-1 activation
and IL-1β release induced by cytosolic LPS stimulation, which activates
the non-canonical inflammasome via caspase-11, also required NEK7
(Extended Data Fig. 2c, d). Consistent with previous studies¹⁵⁻¹⁷,
cytotoxicity induced by cytosolic LPS required caspase-11, but not NLRP3
or NEK7 (Extended Data Fig. 2e). To ensure that impaired NLRP3
activation in Nek7−/− BMDMs was not secondary to abnormal mouse
development, we deleted Nek7 using CRISPR-Cas9 genome editing in
iBMDMs. NLRP3 inflammasome activation induced by nigericin was
abrogated in NEK7-deficient macrophages (Fig. 2f, g and Extended
Data Fig. 3a–c). Importantly, re-expression of NEK7 in NEK7-deficient
macrophages restored NLRP3 inflammasome activation (Fig. 2f, g).
Similarly, knockdown of NEK7 by short hairpin RNAs (shRNAs)
impaired caspase-1 activation and IL-1β release, but not
TNF-α production, in response to ATP, nigericin or silica (Extended
Data Fig. 4a–d). We also depleted NEK7 in BMDMs containing the
activating NLRP3(R258W) mutation corresponding to the human
NLRP3(R260W) mutation that causes Muckle–Wells syndrome¹⁸.
In agreement with previous studies¹⁹, treatment of NLRP3(R258W)
BMDMs with LPS alone was sufficient to activate caspase-1 and IL-1β
release (Fig. 2h, i). Notably, caspase-1 activation and IL-1β release
elicited by LPS in NLRP3(R258W) BMDMs were impaired by NEK7
knockdown (Fig. 2h, i). These results indicate that NEK7 acts on or
just downstream of both wild-type and CAPS-associated NLRP3 to
regulate the inflammasome.

Stimulation of Nek7+/+ BMDMs with the NLRP3 activators ATP
and nigericin, as well as poly(dA:dT) or Salmonella, induced rapid
formation of large intracellular ASC aggregates called ASC specks
in the cytosol (Fig. 3a, b). In Nek7−/− BMDMs, the formation of ASC
specks induced by ATP or nigericin was abrogated, whereas speck
formation induced by poly(dA:dT) or Salmonella was unperturbed (Fig. 3a, b).
Consistently, ASC oligomerization triggered by stimulation with several
NLRP3 activators, but not poly(dA:dT) or Salmonella infection,
was abolished in Nek7−/− BMDMs or greatly reduced in BMDMs
with knockdown of NEK7 (Fig. 3c and Extended Data Fig. 5a–c).
Activated inflammasomes assemble into high-molecular-mass
multiprotein complexes²⁰. To assess NLRP3 inflammasome assembly,
wild-type and Nek7−/− BMDMs were stimulated with nigericin or
ATP, digitonin-solubilized cell lysates were resolved by blue native
polyacrylamide gel electrophoresis (PAGE), and the blots were then
probed with anti-NLRP3 and anti-NEK7 antibodies. A large
oligomeric complex (>1,000 kilodaltons (kDa)) containing NLRP3
and NEK7 was induced in wild-type BMDMs after stimulation with
ATP or nigericin, and this was greatly reduced or absent in stimulated
Nek7−/− and Nlrp3−/− cells (Fig. 3d). To resolve the formation of
NLRP3 oligomers better, we separated the samples in the first dimension
by blue native PAGE and then in a second dimension by SDS-PAGE.
Immunoblotting revealed that NEK7 was indeed present in a
high-molecular-mass NLRP3 complex induced by ATP or nigericin
in primary wild-type BMDMs, which was absent in unstimulated
wild-type cells and stimulated Nek7−/− or Nlrp3−/− cells (Fig. 3e).
To determine whether the kinase activity of NEK7 is important for
the regulation of the NLRP3 inflammasome, we expressed wild-type


18 FEBRUARY 2016 | VOL 530 | NATURE | 355 


© 2016 Macmillan Publishers Limited. All rights reserved 




Figure 3 | NEK7 is required for NLRP3 oligomerization and ASC speck
formation downstream of potassium efflux. a, b, Representative
immunofluorescence images and quantification of endogenous ASC
specks (arrows). ND, not detected. Data show representative results from
three combined independent experiments. Scale bars, 10 μm. Error bars
indicate s.d. c, ASC oligomerization induced by indicated stimuli in
Nek7+/+ and Nek7−/− LPS-primed macrophages. Mock denotes stimulation
with PBS. TX, Triton X-100. d, Indicated LPS-primed macrophages were
stimulated with PBS (mock), ATP or nigericin. Analyses by blue native
PAGE or SDS-PAGE, and immunoblotting. WB, western blot. e, Cell
lysates were separated by a first dimension of blue native PAGE followed
by a second dimension of SDS-PAGE. NS, nonspecific band. f, g,
NEK7-NLRP3 interactions and complex formation in the presence of 5 mM
or 50 mM extracellular KCl. h, Proposed model for mechanism of
NEK7-mediated NLRP3 inflammasome activation. Results are
representative of at least three independent experiments. See
Supplementary Fig. 2 for gel source data.


or mutant NEK7 with mutations at the critical catalytic lysine residues
Lys63 and Lys64 or glycine residue Gly43 in Nek7−/− iBMDMs
by lentivirus-mediated transduction. All of these NEK7 mutations
have been found to abolish the kinase activity of NEK7 (refs 12, 13).
Importantly, the ability of nigericin to induce caspase-1 activation and
IL-1β release was not impaired in Nek7−/− iBMDMs reconstituted
with these NEK7 mutant proteins (Extended Data Fig. 6a–c). In addition,
shRNA depletion of NEK9, the kinase that phosphorylates and
activates NEK7 (refs 11, 21), or of the NEK7-related kinase NEK6 did not
impair the activation of the NLRP3 inflammasome (Extended Data
Fig. 7a–e). NEK7 regulates microtubule dynamics²²,²³. However, treatment
with nocodazole or colchicine, two inhibitors of tubulin polymerization,
did not inhibit NLRP3 inflammasome activation and had a
marginal inhibitory effect on IL-1β and TNF-α release (Extended
Data Fig. 8a–e). Furthermore, inhibition of tubulin polymerization
did not affect the NLRP3-NEK7 interaction (Extended Data Fig. 8f).
Inhibition of Bruton's tyrosine kinase (BTK), which has been implicated
in NLRP3 inflammasome activation²⁴, did not disrupt the interaction
between NEK7 and NLRP3 (Extended Data Fig. 9a, b). Additionally,
we could not detect BTK in NLRP3 complexes before or after
ATP stimulation (Extended Data Fig. 9b). Stimulation of Nek7+/+
and Nek7−/− BMDMs with ATP, nigericin or gramicidin induced
comparable levels of potassium efflux (Extended Data Fig. 10).
Notably, the association of NLRP3 with NEK7 as well as caspase-1
activation triggered by ATP were blocked in the presence of 50 mM
KCl, which inhibits potassium efflux and NLRP3 activation (Fig. 3f).
Importantly, the induced high-molecular-mass NLRP3 complex
containing NEK7 was not seen in BMDMs cultured in 50 mM KCl
(Fig. 3g). These results indicate that potassium efflux promotes the
NLRP3-NEK7 interaction that is required for NLRP3 inflammasome
assembly and caspase-1 activation (Fig. 3h).

We assessed the role of NEK7 in the regulation of the inflammasome
in vivo. We generated mouse chimaeras after transplanting fetal liver
cells from wild-type, Nek7−/− or Nlrp3−/− embryos into lethally irradiated
recipient mice, and IL-1β production induced by LPS challenge
was assessed in the serum²⁵,²⁶. Administration of LPS to chimaeric
mice reconstituted with wild-type fetal liver cells induced IL-1β
production, which was diminished similarly in chimaeric mice
reconstituted with Nek7−/− or Nlrp3−/− haematopoietic cells (Fig. 4a).
Consistent with additional NLRP3 function in radioresistant recipient
cells²⁷, chimaeric Nlrp3−/− recipients transplanted with Nek7−/− or
Nlrp3−/− fetal liver cells showed further reductions in IL-1β production
(Fig. 4a). Notably, production of IL-6 and TNF-α in the sera of all
chimaeric mice was comparable (Fig. 4b, c). Likewise, production of IL-1β
induced by intraperitoneal administration of monosodium urate (MSU)
crystals, which is also largely mediated by the NLRP3 inflammasome²⁸,
was reduced at comparable levels in chimaeric mice reconstituted with
Nek7−/− or Nlrp3−/− fetal liver cells (Fig. 4d). These studies indicate that
NEK7 is required for activation of the NLRP3 inflammasome in vivo.

We have shown that NEK7 is an essential factor that specifically and 
non-redundantly functions downstream of potassium efflux to reg- 
ulate the activation of the NLRP3 inflammasome. Our results are in 
agreement with recent studies that identified NEK7 as a critical regulator
of the NLRP3 inflammasome²⁹,³⁰. Our studies suggest a model in
which potassium efflux induced by NLRP3-activating stimuli triggers 
the association of NLRP3 with NEK7, leading to the assembly and 
activation of the NLRP3 inflammasome (Fig. 3h). Macrophages con- 
taining the CAPS-associated NLRP3(R258W) activating mutation that 

Figure 4 | NEK7 is required for activation of the NLRP3 inflammasome
in vivo. a–c, Mouse serum cytokines IL-1β (a), IL-6 (b) and TNF-α (c)
were analysed after intraperitoneal injection of LPS. d, IL-1β was
analysed in peritoneal lavage fluids after intraperitoneal injection of
MSU. Each symbol represents one mouse. Mean values are indicated by
a horizontal bar. In the absence of LPS or MSU stimulation, the amounts
of IL-1β were undetectable. Results are representative of two independent
experiments. NS, not significant. *P < 0.05; **P < 0.01; ***P < 0.001
(Kruskal-Wallis test).


does not require potassium efflux for inflammasome activation¹⁹ also
required NEK7 for caspase-1 activation, suggesting that this mutant 
NLRP3 may be competent for NEK7 association in the absence of 
potassium efflux. NEK7 regulates microtubule dynamic instability and
spindle assembly, which require NEK7 catalytic activity¹²,¹³,²²,²³.
By contrast, the catalytic activity of NEK7 is not required for NLRP3 
activation. Further work is needed to understand the dual functions 
of NEK7. Taken together, our studies suggest that NEK7 could be a 
potential target of therapeutics to treat inflammatory diseases linked 
to NLRP3 inflammasome activation. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS

Mice. Nek7+/−, Nlrp3−/−, Asc−/− (also known as Pycard−/−), Casp1−/−Casp11−/−,
Casp11−/− (also known as Casp4−/−) mice on a C57BL/6 background have been
reported¹⁴,³¹⁻³³. NLRP3(R258W) mice were originally provided by Warren Strober
(NIH). C57BL/6 mice were originally purchased from Jackson Laboratories and 
maintained in our facility. All animal studies were approved by the University of 
Michigan Committee on Use and Care of Animals. 

Reagents. High-capacity streptavidin agarose resin was from Thermo Scientific 
(20359). S-protein agarose beads were from Novagen (69704). Biotin (B4501), 
nocodazole (M1404) and colchicine (C3915) were purchased from Sigma. LFM- 
A13 (1300) was purchased from Tocris. DOTAP liposomal transfection reagent was 
from Roche (11202375001). CytoTox 96 Non-Radioactive Cytotoxicity Assay Kit 
was purchased from Promega (G1780). LPS-B5 Ultrapure (tlrl-pb5lps), LPS-SM 
Ultrapure (tlrl-smlps), MSU (tlrl-msu), CPPD (tlrl-cppd), Nano-SiO2 (tlrl-sio)
and poly(dA:dT)/lyovec (tlrl-patc) were purchased from InvivoGen. Alum (77161) 
was from Thermo Fisher Scientific. ATP (A2383) was from Sigma. Nigericin 
(481990) was purchased from EMD Millipore. Silica (Min-U-Sil 5) was from US 
Silica. LLOMe was purchased from Chem-Impex. Salmonella enterica serovar 
Typhimurium strain SL1344 was a gift from D. Monack. Human NEK7 cDNA
clone was purchased from the Harvard DNA Resource Core (HsCD00082815). 
Mouse Nek7 cDNA clone was from Thermo Scientific (MMM1013-202764053). 
Lentiviral expression vector pHIV-EGFP was from Addgene (21373). QuikChange 
II XL Site-Directed Mutagenesis Kit was from Agilent (20051). Antibodies for 
NEK7 (ab133514), NEK6 (ab133494) and NEK9 (ab138488) were purchased from 
Abcam. Mouse anti-NLRP3 (Cryo-2) was from AdipoGen2. Actin (A00730-200), 
HA tag (A01244-100) and Flag (DYKDDDDK) tag (A00187-200) antibodies were 
purchased from GenScript. Anti-α/β tubulin and anti-BTK antibodies were from
Cell Signaling (2148, 3533). ASC antibody and caspase-1 antibody for the cleaved 
p20 of caspase-1 were generated in our laboratory. 

Cell culture. BMDMs were generated by differentiating bone marrow progenitors 
from the tibia and femur for 7 days in 10% FCS IMDM (Gibco) supplemented with 
30% L-cell supernatant, non-essential amino acids, sodium pyruvate and antibiotics 
(penicillin/streptomycin). iBMDMs were generated as previously described³⁴.
iBMDMs were cultured in 10% FCS IMDM (Gibco) supplemented with non- 
essential amino acids, sodium pyruvate and antibiotics (penicillin/streptomycin). 
HEK293T cells (ATCC) were cultured on DMEM (Sigma) containing 10% FCS 
and antibiotics (penicillin/streptomycin). HEK293T cells were not authenticated 
or tested for mycoplasma contamination. 

Purification of NLRP3-associated proteins. To search for interacting partners of 
NLRP3 in mouse macrophages, we generated a triple-tagged NLRP3 (NLRP3-SFP) 
that was fused with three tags in the C terminus: S-tag, Flag (for detection) and 
streptavidin-binding tag. This tagged NLRP3 can be used to purify NLRP3 protein 
without antibody-based immunoprecipitation. NLRP3-SFP was cloned into the 
pHIV-EGFP lentiviral expression vector (Addgene). Transduced immortalized 
Nlrp3−/− macrophage clones were sorted by flow cytometry using green fluorescent
protein (GFP) as a marker and screened for NLRP3 expression by immunoblotting. 
Clones with NLRP3 expression levels comparable to those found in wild-type 
iBMDMs stimulated with LPS were selected for further analysis. Macrophages 
were plated in 10-cm Petri dishes and primed with 200 ng ml−1 LPS for 4 h before
stimulation with 5 mM ATP for 30 min. Cells were washed once with cold PBS
after stimulation, and then lysed in ice-cold lysis buffer (50 mM Tris-HCl, pH 7.4,
2 mM EDTA, 150 mM NaCl, 0.5% Nonidet P-40, 1× EDTA-free Roche protease
inhibitor cocktail) for 10 min at 4°C. Cell lysates were collected and spun down at
20,000g for 15 min at 4°C. The soluble fraction was incubated with streptavidin 
agarose beads for 2h at 4°C. The beads were washed three times with lysis buffer. 
Bound proteins were eluted with 2 mM biotin in PBS, and then re-captured by 
incubation with S-protein agarose beads. The NLRP3-associated proteins were 
finally eluted in SDS sample buffer (10% glycerol, 60 mM Tris-HCl, pH 6.8, 2% 
SDS, 0.02% bromophenol blue and 5% β-mercaptoethanol), separated by 4–12%
SDS-PAGE, followed by mass spectrometry. Nlrp3−/− macrophages reconstituted
with empty vector (pHIV-EGFP) were used as a control. 

HEK293T cell transfection. HEK293T cells were plated into 6-well tissue cul- 
ture plates (6.25 x 10° cells per well in 2ml complete DMEM) overnight. Cells 
were transfected for 16 h with triple Flag-tagged full-length NLRP3 (0.5 μg), pyrin
domain deletion mutant (Δpyrin; amino acids 94–1036, 0.5 μg), LRR deletion
mutant (ΔLRR; amino acids 1–741, 1 μg), pyrin domain (amino acids 1–93, 0.5 μg),
NOD domain (amino acids 220–536, 2.5 μg), LRR domain (amino acids 742–991,
1 μg), or co-transfected Flag-tagged full-length NLRP3 (0.5 μg) with NEK7-3×HA
(0.5 μg) by Lipofectamine LTX (Invitrogen).
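
The construct boundaries listed above can be tabulated to see which NLRP3 domains each construct retains. The following is a minimal sketch, not part of the original methods: the identifiers (`DOMAINS`, `CONSTRUCTS`, `BINDS_NEK7`, `domains_in`) are hypothetical, the full-length span of 1–1036 is an assumption inferred from the Δpyrin boundaries, and the binding outcomes are transcribed from the description of Fig. 1d, e in the main text.

```python
# Amino-acid ranges (inclusive) taken from the transfection protocol.
DOMAINS = {"pyrin": (1, 93), "NOD": (220, 536), "LRR": (742, 991)}

# Expressed constructs; the full-length end (1036) is an assumption
# inferred from the delta-pyrin construct (amino acids 94-1036).
CONSTRUCTS = {
    "full_length": (1, 1036),
    "delta_pyrin": (94, 1036),
    "delta_LRR": (1, 741),
    "pyrin_only": (1, 93),
    "NOD_only": (220, 536),
    "LRR_only": (742, 991),
}

# NEK7 co-immunoprecipitation outcome as described in the main text.
BINDS_NEK7 = {
    "full_length": True,
    "delta_pyrin": True,
    "delta_LRR": False,
    "pyrin_only": False,
    "NOD_only": False,
    "LRR_only": False,
}


def domains_in(construct):
    """Return the set of domains fully contained in a construct."""
    start, end = CONSTRUCTS[construct]
    return {
        name
        for name, (d_start, d_end) in DOMAINS.items()
        if start <= d_start and d_end <= end
    }


def consistent_with_nod_plus_lrr():
    """Check that binding coincides with the presence of both NOD and LRR."""
    return all(
        ({"NOD", "LRR"} <= domains_in(name)) == binds
        for name, binds in BINDS_NEK7.items()
    )


if __name__ == "__main__":
    for name in CONSTRUCTS:
        print(name, sorted(domains_in(name)), BINDS_NEK7[name])
    print("NOD + LRR explains binding:", consistent_with_nod_plus_lrr())
```

Under these assumptions, every construct that retains both the NOD and the leucine-rich repeats binds NEK7, and every construct missing either one does not, which is the pattern the main text uses to conclude that both regions are involved.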

Immunoprecipitation. Cells were lysed in ice-cold lysis buffer (50 mM Tris,
pH 7.4, 2 mM EDTA, 150 mM NaCl, 0.5% Nonidet P-40, 1× EDTA-free Roche
protease inhibitor cocktail). Cell lysates were clarified by centrifugation (14,000g)
at 4°C for 10 min. Pre-cleared cell lysates were incubated with anti-NEK7 (1:100),
anti-NLRP3 (1:100), anti-Flag (1:200), anti-HA (1:200) antibody or control IgG at
4°C overnight. The proteins bound by antibody were pulled down by protein G
beads and subjected to immunoblotting analysis.

Inflammasome activation and cytotoxicity assay. Macrophages were plated in
12-well plates at 1 x 10° cells per well. One day later, culture medium was replaced
with 0.5 ml serum-free IMDM per well. Cells were then primed with 200 ng ml−1
ultrapure LPS for 4 h, followed by stimulation with PBS (mock), 5 mM ATP
(30 min), 5 μM nigericin (1 h), 0.5 μM gramicidin (1 h), poly(dA:dT) (2 μg ml−1,
4 h), Salmonella (multiplicity of infection (m.o.i.) = 10, 1 h), LLOMe (2 μM, 6 h),
silica (500 μg ml−1, 6 h), Alum (250 μg ml−1, 6 h), MSU (200 μg ml−1), CPPD
(200 μg ml−1, 6 h) or Nano-SiO2 (200 μg ml−1, 6 h). After stimulation, supernatants
and cell lysates were collected separately, or combined together for
immunoblotting analysis. For non-canonical inflammasome activation, 1 μg of LPS-SM
was packaged with 15 μl of DOTAP according to the manufacturer's instructions.
After 30 min incubation at room temperature, the mixtures were diluted into 1 ml
Opti-MEM and added to each well of a 6-well plate containing (LPS-B5)-primed
macrophages (4 h priming). Four hours later, culture supernatants and cell lysates
were collected for immunoblotting, and 48-well plates were used for IL-1β release
and cytotoxicity assay.

Gene knockdown in primary macrophages. Lentiviral pLKO.1 plasmids 
targeting Nek7, Nek6 and Nek9 were purchased from Sigma. Nek7 shRNA1
(TRCN0000236005, targeting sequence:
5′-CCGGTGGAGTGCCGGTAGCGTTAAACTCGAGTTTAACGCTACCGGCACTCCATTTTTG-3′), Nek7
shRNA2 (TRCN0000236007,
5′-CCGGGAGCTACGACAGCTAGTTAATCTCGAGATTAACTAGCTGTCGTAGCTCTTTTTG-3′), Nek6 shRNA
(TRCN0000274723,
5′-CCGGAGGACAGTTCAGTGAGGTTTACTCGAGTAAACCTCACTGAACTGTCCTTTTTTG-3′) and Nek9
shRNA (TRCN0000027597,
5′-CCGGCGACAACATCATTGCCTACTACTCGAGTAGTAGGCAATGATGTTGTCGTTTTT-3′) were used
in this study. pLKO.1 scramble (Addgene 1864)
was used as a negative control. Lentiviruses were packaged in HEK293FT cells and 
concentrated by Lenti-X concentrator (Clontech). Gene knockdown in primary 
macrophages was performed as described previously”°. Puromycin-selected mac- 
rophages were collected, counted and plated on day 8, and used for experiments 
on the next day. The knockdown efficiency was analysed by immunoblotting with 
antibodies against NEK7, NEK6 or NEK9. 
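
The four shRNA oligonucleotides listed above share a common layout, which can be checked mechanically after joining the fragments that were split across lines. The sketch below is illustrative only: the layout (a CCGG overhang, a 21-nucleotide sense stem, a CTCGAG loop, the reverse-complement antisense stem and a poly-T terminator) is an assumption inferred from the listed sequences rather than stated in the methods, and the function names are hypothetical.

```python
# Oligo sequences as listed in the methods, with line-break spaces removed.
SHRNA_OLIGOS = {
    "Nek7 shRNA1": "CCGGTGGAGTGCCGGTAGCGTTAAACTCGAGTTTAACGCTACCGGCACTCCATTTTTG",
    "Nek7 shRNA2": "CCGGGAGCTACGACAGCTAGTTAATCTCGAGATTAACTAGCTGTCGTAGCTCTTTTTG",
    "Nek6 shRNA": "CCGGAGGACAGTTCAGTGAGGTTTACTCGAGTAAACCTCACTGAACTGTCCTTTTTTG",
    "Nek9 shRNA": "CCGGCGACAACATCATTGCCTACTACTCGAGTAGTAGGCAATGATGTTGTCGTTTTT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_hairpin(oligo, stem_len=21, overhang="CCGG", loop="CTCGAG"):
    """Split an oligo into its assumed hairpin parts and validate them."""
    pos = len(overhang)
    sense = oligo[pos:pos + stem_len]
    pos += stem_len
    loop_found = oligo[pos:pos + len(loop)]
    pos += len(loop)
    antisense = oligo[pos:pos + stem_len]
    terminator = oligo[pos + stem_len:]
    valid = (
        oligo.startswith(overhang)
        and loop_found == loop
        and antisense == reverse_complement(sense)
        and terminator.startswith("TTTTT")
    )
    return {
        "sense": sense,
        "antisense": antisense,
        "terminator": terminator,
        "valid": valid,
    }


if __name__ == "__main__":
    for name, oligo in SHRNA_OLIGOS.items():
        parts = parse_hairpin(oligo)
        print(name, parts["sense"], "valid hairpin:", parts["valid"])
```

A check of this kind is mainly useful for catching transcription errors, since a single wrong base in either stem breaks the reverse-complement match.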

Targeting of NEK7 by CRISPR-Cas9-mediated genome editing. Lentiviral 
CRISPR-Cas9 targeting guide RNA (gRNA) expressing vector (lentiCRIS- 
PRv2) was obtained from Addgene (52961). The NEK7 knockout target 
sequence used was 5’-TGAAAAACCGACCAAGCCCA-3’. To generate a 
NEK7 knockout cell line, lentiCRISPRv2 stocks containing the target sequence 
5'-TGAAAAACCGACCAAGCCCA-3’ were used to transduce iBMDMs. 
Three days later, single clones of puromycin-resistant cells were sorted by flow 
cytometry and clones with NEK7-deficiency were identified by immunoblotting 
with anti-NEK7 antibody.

Reconstitution of NEK7 in Nek7−/− macrophages. Immortalized Nek7−/− macrophages
were transduced with lentivirus containing pHIV-NEK7-3×HA. After
3–4 days, transduced cells were sorted by flow cytometry using eGFP as a marker.
Expression of reconstituted proteins was determined by immunoblotting.

Generation of chimaeras from fetal liver cells. Fetal livers were collected at
day 14 of gestation from wild-type, Nek7−/− or Nlrp3−/− embryos. Wild-type and
Nek7−/− fetuses were genotyped with primers (wild-type, forward,
5′-AAGGCTTACTTGGTGACACACTGG-3′; reverse, 5′-CACCGTGCAGGTGACTCGAACC-3′;
Nek7−/−, forward, 5′-CCCTGGCGTTACCCAACTTAATCGCCTTGC-3′; reverse,
5′-TTCTCCGTGGGAACAAACGGCGGATTGAC-3′), which
yielded a 400-base-pair (bp) wild-type DNA band and a 302-bp Nek7 mutant DNA
band after electrophoresis on 1.5% agarose gel in TBE buffer. Fetal livers were
collected from fetuses and processed into single-cell suspensions using a 40 μm Nylon
Cell Strainer (Falcon). Approximately 5 x 10° fetal liver cells in 100 μl PBS were
administered by retro-orbital injection into 6-week-old gender-matched wild-type
or Nlrp3−/− recipient mice that had been lethally irradiated using X-ray (Phillips
RT250, Kimtron Medical) with two doses of 5.40 Gy (total 10.8 Gy). Recipient mice
were allocated randomly into experimental groups. Recipient animals received
antibiotics (neomycin and polymyxin B) for 4 weeks after reconstitution. Eight weeks
later, successful generation of chimaeras was confirmed by PCR-based analysis
and animals were analysed (see above). The number of animals per group (n = 5–8)
was chosen as the minimum likely required for conclusions of biological
significance, established from previous experience.
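
The genotyping read-out and the irradiation dose described above reduce to simple bookkeeping, sketched here for illustration. The function name `call_genotype` and the 15-bp tolerance are hypothetical, and the sketch assumes that both primer pairs are run on each sample so that a heterozygous fetus shows both bands; the methods do not spell this out.

```python
# Expected PCR product sizes from the genotyping protocol (base pairs).
WT_BAND_BP = 400
MUTANT_BAND_BP = 302

# Two X-ray doses of 5.40 Gy each, as stated in the protocol.
DOSES_GY = [5.40, 5.40]
TOTAL_DOSE_GY = sum(DOSES_GY)


def call_genotype(band_sizes_bp, tolerance_bp=15):
    """Assign a Nek7 genotype from the band sizes observed on the gel.

    tolerance_bp is a hypothetical allowance for sizing error on an
    agarose gel; it is not a value from the original methods.
    """
    has_wt = any(abs(b - WT_BAND_BP) <= tolerance_bp for b in band_sizes_bp)
    has_mut = any(abs(b - MUTANT_BAND_BP) <= tolerance_bp for b in band_sizes_bp)
    if has_wt and has_mut:
        return "Nek7+/-"
    if has_wt:
        return "Nek7+/+"
    if has_mut:
        return "Nek7-/-"
    return "undetermined"


if __name__ == "__main__":
    for bands in ([400], [302], [398, 305], []):
        print(bands, "->", call_genotype(bands))
    print("total dose (Gy):", round(TOTAL_DOSE_GY, 2))
```

Only fetuses called wild-type or Nek7−/− would be used as donors in the scheme described; heterozygous or undetermined samples are reported by the sketch but play no further role.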

ASC speck staining and ASC oligomer cross-linking. BMDMs were plated on 
an 8-well permanox chamber slide (Thermo Scientific, 177445) overnight. Cells 
were primed with 200 ng ml−1 LPS for 4 h, then stimulated with ATP or nigericin,
transfected with poly(dA:dT), or infected with Salmonella. After stimulation, cells
were fixed with 4% paraformaldehyde, permeabilized with 0.1% Triton X-100, and
the slides were blocked with PBS buffer containing 3% BSA. Cells were stained with 
anti-ASC antibody and Alexa-Fluor-488-conjugated secondary antibody. DAPI 
was used to stain nuclei. Cell images were taken using an Olympus Fluo-View 500 
confocal microscope system. 

For ASC oligomer cross-linking, cells were plated on 12-well plates and stimu- 
lated as indicated. Cells were lysed with PBS buffer containing 0.5% Triton X-100, 
and the cell lysates were centrifuged at 6,797g for 15 min at 4°C. Supernatants were
transferred to new tubes (Triton-X-100-soluble fractions). The Triton-X-100-insoluble
pellets were washed with PBS twice and then suspended in 200 μl PBS. The pellets
were then cross-linked at room temperature for 30 min by adding 2 mM
bis[sulfosuccinimidyl] suberate (BS3). The cross-linked pellets were spun down at 6,797g
for 15 min and dissolved directly in SDS sample buffer.

Blue native PAGE and 2D PAGE. Blue native gel electrophoresis was performed
using the Bis-Tris Native PAGE system as previously described²⁰. In brief,
2 x 10° macrophages were re-plated in 6-well plates. On the second day, cells were
primed with 200 ng ml−1 LPS for 4 h, followed by stimulation with PBS (mock),
5 mM ATP (30 min) or 5 μM nigericin (1 h). Macrophages were washed once with
cold PBS and then lysed in ice-cold native lysis buffer (20 mM Bis-tris, 500 mM
ε-aminocaproic acid, 20 mM NaCl, 10% (w/v) glycerol, 0.5% digitonin, 0.5 mM
Na3VO4, 1 mM PMSF, 0.5 mM NaF, 1× EDTA-free Roche protease inhibitor cocktail,
pH 7.0) for 15 min on ice. Cell lysates were clarified by centrifugation at 20,000g
for 30 min at 4°C and analysed without further purification steps. Cell lysates were
equalized after quantification of total protein using the BCA protein assay (Pierce),
and then separated by 4–12% blue native PAGE. Native gels were incubated in 10%
SDS solution for 5 min before transfer to PVDF membranes (Millipore), followed
by conventional western blotting. In 2D PAGE, a gel slice of the natively resolved
gel was placed in a dish containing 1× Laemmli sample buffer (12.5 mM Tris-Cl
(pH 6.8), 5% β-mercaptoethanol, 4% (w/v) SDS, 0.1% bromophenol blue, 20% (v/v)
glycerol) for 10 min, microwaved on high for 20 s, and rocked for another 15 min
before loading the slice into a well of a 4–12% SDS-PAGE gel. Because the speck was
mostly lost in the pellet fraction, the analyses of the NEK7–NLRP3 interaction
probably excluded proteins in the speck.

Potassium efflux assay. Intracellular potassium measurements were performed 
as described previously*. In brief, macrophages were plated on 96-well plates 1 day 
before experiments. Culture medium was thoroughly aspirated after stimulation.
Cells were lysed with 3% ultrapure HNO3. Intracellular potassium was then 
determined by inductively coupled plasma optical emission spectrometry with 
an Optima 2000 DV spectrometer (PerkinElmer Life Sciences) using yttrium as 
an internal standard. 

Cytokine measurements. Cytokines were measured with ELISA kits (R&D 
Systems). 

Stimulation with endotoxin in vivo. Mice were injected intraperitoneally with
20 mg kg−1 LPS (Escherichia coli O111:B4; Sigma). The investigators were not
blinded to allocation during experiments. The blood samples were collected 3 h
later. Serum cytokines were measured by ELISA. The analysis was performed
blindly by an independent researcher, and the experiments were not randomized.

MSU-induced mouse peritonitis. Chimaeric mice were injected intraperitoneally
with 1 mg MSU dissolved in 0.5 ml sterile PBS. Mice were euthanized 6h later 
and peritoneal cavities were flushed with 5 ml cold PBS. Peritoneal lavage fluids 
were collected and cytokines were measured by ELISA after concentration using 
an Amicon Ultra 10K filter from Millipore. The investigators were not blinded to 
allocation during experiments. Serum cytokines were measured by ELISA. The 
analysis was performed blindly by an independent researcher, and the experiments 
were not randomized. 

Statistical analysis. No statistical methods were used to predetermine sample size. 
All analyses were performed using GraphPad Prism. Differences were considered 
significant when P < 0.05.
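
The Figure 4 legend names the Kruskal-Wallis test for comparing cytokine levels across groups of mice. The analyses in the paper were run in GraphPad Prism; the following is only a standard-library Python sketch of the same rank-based statistic, offered to make the calculation concrete. The function names are hypothetical, the example values are made-up placeholders and not data from the study, and the closed-form P value applies only to the three-group case (chi-square approximation with two degrees of freedom).

```python
import math


def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    Assumes at least two distinct values overall (otherwise the tie
    correction is zero and H is undefined).
    """
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Assign each distinct value the mean of the 1-based ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        ties = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += ties ** 3 - ties
        i = j

    rank_sum_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    return h / correction


def p_value_three_groups(h):
    """Chi-square upper-tail probability with 2 degrees of freedom."""
    return math.exp(-h / 2)


if __name__ == "__main__":
    # Made-up placeholder values (arbitrary units), NOT data from the study.
    group_a = [1.0, 2.0, 3.0]
    group_b = [4.0, 5.0, 6.0]
    group_c = [7.0, 8.0, 9.0]
    h = kruskal_wallis_h(group_a, group_b, group_c)
    print("H =", round(h, 3), "P =", round(p_value_three_groups(h), 4))
```

With small groups such as n = 5–8 per condition, the chi-square approximation is rough, and statistics packages may report exact or differently corrected P values, so a sketch like this is suited to checking the logic rather than reproducing published significance levels.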
Extended Data Figure 1 | NEK7 interacts with NLRP3 upstream
of ASC and caspase-1. a, Immortalized Nlrp3−/− macrophages were
infected with lentiviruses containing pHIV vector (Vector) or pHIV-
NLRP3-SFP (NLRP3-SFP). Transduced macrophages were sorted by
flow cytometry. Cells were treated as indicated. Supernatants and cell
lysates were collected and analysed by immunoblotting with indicated
antibodies. b, Immunoblots showing that NEK7, but not NEK6 or NEK9,
interacts with NLRP3 in macrophages. NLRP3-associated proteins were
pulled down from reconstituted Nlrp3−/− macrophages by streptavidin
beads, and analysed by immunoblotting with indicated antibodies.
c, ATP stimulation, but not LPS priming, greatly enhances the NEK7-
NLRP3 interaction. d, LPS-primed BMDMs from wild-type, Asc−/− and
Casp1−/−Casp11−/− mice were left untreated or treated with 5 mM ATP for
30 min. NLRP3 protein complexes were immunoprecipitated with anti-
NLRP3 antibody and analysed by immunoblotting. Data are representative
of three (a, b) or two (c, d) independent experiments. See Supplementary
Figs 2 and 3 for gel source data.


Extended Data Figure 2 | NEK7 deficiency does not affect TNF-α
release, but abrogates NLRP3 activation induced by cytosolic LPS in
macrophages. a, TNF-α secretion from LPS-primed BMDMs treated with
ATP, nigericin, gramicidin, poly(dA:dT) or Salmonella. b, TNF-α secretion
in LPS-primed BMDMs treated with LLOMe, silica, Alum, MSU, CPPD
or nano-SiO₂. Mock represents macrophages primed with LPS without
further stimulation. The amounts of TNF-α release in the absence of LPS
priming were below 50 pg ml⁻¹ in the supernatants. c–e, BMDMs from
C57BL/6, Casp11⁻/⁻, Nek7⁻/⁻ (fetal liver chimaeras) and Nlrp3⁻/⁻ mice
were primed with LPS for 4 h before transfection with or without
LPS with DOTAP. Four hours after transfection, caspase-1 activation (c),
IL-1β release (d) and cytotoxicity (e) were analysed. Graphs show
the mean and s.d. of triplicate wells, and are representative of three
independent experiments. See Supplementary Fig. 3 for gel source data.

Extended Data Figure 3 | NEK7 is required for activation of the NLRP3
inflammasome in macrophages. a–c, Defective NLRP3 inflammasome
in NEK7-deficient macrophages. Wild-type and NEK7-deficient
macrophages generated by CRISPR-Cas9 genome editing were primed
with LPS, followed by stimulation with 5 µM nigericin (1 h), 5 mM ATP
(30 min) or 200 µg ml⁻¹ Alum (6 h). Cell lysates were collected and
analysed for caspase-1 activation by immunoblotting (a). IL-1β (b) and
TNF-α (c) release were analysed by ELISA. ELISA data (b, c) show the
mean and s.d. of triplicate wells. The amounts of TNF-α release in the
absence of LPS priming were below 50 pg ml⁻¹ in the supernatants. Data
are representative of three independent experiments. See Supplementary
Fig. 3 for gel source data.

Extended Data Figure 4 | NEK7 is critical for activation of the NLRP3
inflammasome in primary macrophages. a, NEK7 is dispensable
for NLRP3 induction and expression of the NLRP3 inflammasome
components ASC and caspase-1. BMDMs were treated with control
shRNA or Nek7 shRNAs and selected by culture in 3 µg ml⁻¹
puromycin. Puromycin-resistant macrophages were left unstimulated or
stimulated with LPS for 4 h. Cell lysates were collected and analysed by
immunoblotting. Actin is shown as a loading control. b, c, NEK7 depletion
inhibits activation of the NLRP3 inflammasome. BMDMs treated with
control shRNA or Nek7 shRNAs were primed with LPS, followed by
stimulation with 5 mM ATP (30 min), 5 µM nigericin (1 h) or 500 µg ml⁻¹
silica (4 h). Supernatants and cell lysates were collected after stimulation
and analysed by immunoblotting for caspase-1 activation (b). IL-1β (c)
and TNF-α (d) release were analysed by ELISA. ELISA data (c, d) show
the mean and s.d. of triplicate wells. The amounts of TNF-α release in the
absence of LPS priming were below 50 pg ml⁻¹ in the supernatants. Data
are representative of three independent experiments. See Supplementary
Fig. 3 for gel source data.

Extended Data Figure 5 | NEK7 is critical for NLRP3-mediated
ASC oligomerization and speck formation in macrophages.
a, b, Representative confocal immunofluorescence images (a) and
quantification of endogenous ASC specks (arrows) (b) in macrophages
treated with control shRNA or Nek7 shRNA. Macrophages were
primed with LPS and then stimulated with ATP, nigericin, poly(dA:dT)
or Salmonella. After stimulation, cells were fixed and stained for ASC
(green). DAPI was used for staining nuclear DNA (blue). Data shown
represent results from three combined independent experiments in
which more than 300 cells were counted in each experiment. Error bars
indicate s.d. Original magnifications, ×400. c, Immunoblots showing
ASC oligomerization in control-shRNA-treated or Nek7-shRNA-treated
macrophages. LPS-primed macrophages were treated as indicated.
Cell lysates (Triton X-100 soluble) and BS³-crosslinked pellets (Triton
X-100 insoluble) were analysed by immunoblotting. Mock represents
macrophages primed with LPS without further stimulations. Results are
representative of two independent experiments. *P < 0.05 (two-tailed
Student's t-test). See Supplementary Fig. 3 for gel source data.

Extended Data Figure 6 | NEK7 kinase activity is dispensable for
activation of the NLRP3 inflammasome in macrophages. a–c,
Immortalized NEK7-deficient macrophages were infected with lentivirus
containing empty vector or vector expressing wild-type or indicated
mutant NEK7. Transduced macrophages were sorted by flow cytometry
using GFP as a marker. Macrophages were plated and stimulated with
5 µM nigericin for 1 h after LPS priming. Cell lysates were collected
and analysed for caspase-1 activation by immunoblotting (a). IL-1β (b)
and TNF-α (c) release were analysed by ELISA. ELISA data (b, c)
show the mean and s.d. of triplicate wells. Data are representative
of three independent experiments. See Supplementary Fig. 3 for gel
source data.

Extended Data Figure 7 | NEK6 and NEK9 are dispensable for
activation of the NLRP3 inflammasome in macrophages.
a–c, BMDMs were treated with control shRNA, Nek7 shRNA, Nek6
shRNA or Nek9 shRNA and selected by culture with 3 µg ml⁻¹ puromycin.
Puromycin-resistant cells were plated and primed with LPS, followed by
stimulation with 5 mM ATP for 30 min. Cell lysates were collected and
analysed for caspase-1 activation by immunoblotting (a). IL-1β (b) and
TNF-α (c) release were analysed by ELISA. d, e, Representative confocal
immunofluorescence images (d) and quantification of endogenous ASC
specks (arrows) (e) in macrophages. DAPI was used for staining nuclear
DNA (blue). Quantification of ASC specks was from three combined
independent experiments in which more than 300 cells were counted in
each experiment. Error bars indicate s.d. Scale bars, 10 µm. ELISA data
(b, c) show the mean and s.d. of triplicate wells. Data are representative of
three independent experiments. *P < 0.05 (two-tailed Student's t-test).
See Supplementary Fig. 3 for gel source data.

Extended Data Figure 8 | Microtubule depolymerization does not inhibit
the activation of the NLRP3 inflammasome and the NEK7-NLRP3
interaction in macrophages. a–e, LPS-primed BMDMs were pretreated
with vehicle control (DMSO), nocodazole (1, 10 or 50 µM) or colchicine
(1, 10 or 50 µM) for 1 h before stimulation with 5 µM nigericin. Cell lysates
were collected and analysed for caspase-1 activation by immunoblotting (a).
IL-1β (b) and TNF-α (c) release were analysed by ELISA. Representative
confocal immunofluorescence images (d) and quantification (e) of
endogenous ASC specks (arrows) in macrophages treated as indicated.
Microtubules were stained with anti-α/β tubulin antibody (red). DAPI was
used for staining nuclear DNA (blue). Data shown in e represent results
from three combined independent experiments in which more than 300 cells
were counted in each experiment. Error bars indicate s.d. Scale bars, 10 µm.
f, The NEK7-NLRP3 interaction in untreated and treated macrophages was
analysed by immunoprecipitation and immunoblotting. ELISA data (b, c)
show the mean and s.d. of triplicate wells. Data are representative of
three independent experiments. *P < 0.05 (two-tailed Student's t-test).
See Supplementary Fig. 3 for gel source data.

Extended Data Figure 9 | Inhibition of BTK does not inhibit the
NEK7-NLRP3 interaction in macrophages. a, b, Reconstituted immortalized
Nlrp3⁻/⁻ macrophages were primed with LPS for 4 h. Cells were left
untreated or treated with indicated concentrations of BTK inhibitor
LFM-A13 for 30 min before ATP stimulation. Cell lysates were collected
and analysed for caspase-1 activation by immunoblotting (a). Cell lysates
were collected and subjected to pull-down with streptavidin beads, and
the precipitated protein complex was analysed by immunoblotting with
indicated antibodies (b). Agarose beads were used as a control. Actin is
shown as a loading control. Data are representative of two independent
experiments. See Supplementary Fig. 3 for gel source data.

Extended Data Figure 10 | NEK7-deficiency does not inhibit potassium
efflux induced by NLRP3 stimuli in macrophages. LPS-primed Nek7⁺/⁺
and Nek7⁻/⁻ BMDMs were stimulated as indicated. The intracellular
potassium in each condition was analysed. Mock represents macrophages
primed with LPS without further stimulation. Graphs show the
mean and s.d. of four technical replicates and are representative of
two independent experiments.

Cryo-EM reveals a novel octameric integrase
structure for betaretroviral intasome function

Allison Ballandras-Colas, Monica Brown, Nicola J. Cook, Tamaria G. Dewdney,
Borries Demeler, Peter Cherepanov, Dmitry Lyumkis & Alan N. Engelman


Retroviral integrase catalyses the integration of viral DNA into 
host target DNA, which is an essential step in the life cycle of all 
retroviruses!. Previous structural characterization of integrase- 
viral DNA complexes, or intasomes, from the spumavirus prototype 
foamy virus revealed a functional integrase tetramer”, and it is 
generally believed that intasomes derived from other retroviral 
genera use tetrameric integrase®°. However, the intasomes of 
orthoretroviruses, which include all known pathogenic species, 
have not been characterized structurally. Here, using single-particle 
cryo-electron microscopy and X-ray crystallography, we determine 
an unexpected octameric integrase architecture for the intasome of 
the betaretrovirus mouse mammary tumour virus. The structure 
is composed of two core integrase dimers, which interact with the 
viral DNA ends and structurally mimic the integrase tetramer of 
prototype foamy virus, and two flanking integrase dimers that engage 
the core structure via their integrase carboxy-terminal domains. 
Contrary to the belief that tetrameric integrase components are 
sufficient to catalyse integration, the flanking integrase dimers 
were necessary for mouse mammary tumour virus integrase activity. 
The integrase octamer solves a conundrum for betaretroviruses as 
well as alpharetroviruses by providing critical carboxy-terminal 
domains to the intasome core that cannot be provided in cis because 
of evolutionarily restrictive catalytic core domain-carboxy-terminal 
domain linker regions. The octameric architecture of the intasome 
of mouse mammary tumour virus provides new insight into the 
structural basis of retroviral DNA integration. 

Mouse mammary tumour virus (MMTV) intasomes were assem- 
bled from integrase (IN) and viral DNA (vDNA) components by dif- 
ferential salt dialysis, akin to the strategy used for prototype foamy 
virus (PFV) intasomes’. Fractionation of assembly reactions by size- 
exclusion chromatography (SEC) revealed a higher-order species 
with a distinct elution profile from those of IN and vDNA 
(Fig. 1a). To address biological relevance, strand transfer reactions 
were conducted with supercoiled plasmid target DNA (tDNA) to 
monitor the concerted integration of two vDNA ends" (Fig. 1b). 
The SEC-purified complexes catalysed efficient concerted integra- 
tion activity, which was inhibited by the IN strand transfer inhib- 
itor raltegravir (Fig. 1c). The sequencing of concerted integration 
products excised from agarose gels revealed that most contained 
6 base pair (bp) target site duplications flanking the integrated 
vDNA ends, which are known to occur during MMTV infection! 
(Fig. 1d). To address the specificity of complex formation, the invariant 
CA dinucleotide, which is essential for IN catalysis!*19, was mutated 
to GT in the vDNA substrate. As the mutant vDNA failed to support 
complex formation (data not shown), we conclude that the higher- 
order species identified by SEC are bona fide MMTV intasomes. 
We note that divalent metal ion was a critical cofactor for MMTV
intasome formation. On the basis of prior reports that Ca²⁺ promoted
the assembly of active HIV-1 IN-vDNA complexes but was unable to
support IN catalysis¹⁴, it was used here for intasome assembly.

To determine the MMTV intasome structure, single-particle
cryo-electron microscopy (cryo-EM) data were collected on a Titan
Krios microscope equipped with a Gatan K2 direct detector. Computa-
tional processing of the data revealed the most stable structural con-
formation of the complex, which was refined to ~5–6 Å for different
regions of the map (Fig. 2a and Extended Data Figs 1 and 2). The
MMTV intasome is composed of central core density as well as flank- 
ing density regions that are conformationally mobile compared with 
the intasome core (Extended Data Fig. 3). Restricting data refinement
to the core density region accordingly increased the resolution for the
central portion of the structure to ~4 Å for the best-resolved regions
(Extended Data Fig. 2d).

Figure 1 | MMTV intasome (Int) characterization. a, Purification by
SEC. Elution positions of mass standards in kilodaltons are indicated.
b, Integration assay schematic. Intasome or IN plus vDNA was reacted
with supercoiled tDNA, which can yield half-site (h.s.) or concerted
integration (c.i.) products. c, Ethidium bromide stained agarose gel.
Reactions shown in lanes 1-3 were initiated with IN; vDNA was omitted
from lane 1. Raltegravir (RAL) was included as indicated. Lanes 4 and 5,
intasome reactions. Migration positions of half-site products that
co-migrate with open circular (o.c.) tDNA, concerted integration products,
supercoiled (s.c.) tDNA and mass standards in kilobases are indicated.
For gel source data, see Supplementary Fig. 1. d, Sequenced
intasome-mediated concerted integration products (n = 35).

Figure 2 | Cryo-EM structure of the MMTV intasome. a, Top view
(upper) of the cryo-EM map; the lower view is rotated by 90°. Core density
and flanking density regions are indicated. b, Individual domain crystal
structures (NTD, green; CCD, orange; CTD, purple) and vDNA (blue)
model fitted by rigid body docking. Rulers demarcate 20 Å.

Each IN monomer is composed of an amino (N)-terminal domain
(NTD), a catalytic core domain (CCD) and a carboxy (C)-terminal
domain (CTD) (Extended Data Fig. 4a), and the map was sufficiently
detailed to readily assign these domains to their corresponding
cryo-EM densities. Given a lack of MMTV IN structures, the different
protein domains were crystallized as IN_CCD, IN_CTD and IN_NTD-CCD
fragments, and these structures were refined to 1.7 Å, 1.5 Å and 2.7 Å
resolution, respectively (Extended Data Fig. 5 and Extended Data
Table 1). MMTV DNA was modelled using PFV intasome DNA
coordinates and by extending the modelled fragment by 3 bp in the
region distal from the IN active sites to account for the different vDNA
lengths. Using rigid-body docking, the two vDNAs and eight NTDs,
CCDs and CTDs were unambiguously positioned into the cryo-EM
map (Fig. 2b). Rosetta¹⁵⁻¹⁷ was used to refine the X-ray structures
and vDNA, and to build a subset of interdomain linker regions to
best fit the density within the intasome core region (Extended Data
Fig. 6 and Supplementary Videos 1-5). The resulting model revealed
two molecules of vDNA and eight MMTV INs arranged as four IN
dimers (Fig. 3a). Two catalytic IN dimers A and B are positioned in
the core region in close contact to the vDNAs, whereas supportive IN
dimers C and D locate to the flanking density regions, donating their
CTDs to the core. Flexible linkers connect the IN domains, and the
NTD-CCD linker, which is contracted in most IN protomers, extends
in IN₁ and IN₃ to donate these NTDs in trans to opposing CCDs
(Fig. 3a and Extended Data Fig. 6e). Sedimentation velocity
centrifugation indicated the molecular mass of active MMTV intasomes as
302.1 kDa, which is fully consistent with the octameric IN structure
(calculated IN₈–vDNA₂ = 313.6 kDa; Extended Data Fig. 4b).

The structures of the MMTV and PFV intasomes were
compared to ascertain aspects of the new structure important for
DNA recombination (Fig. 3a). The PFV intasome is composed
of two IN dimers A and B, with the inner protomers of each
dimer (IN₁ and IN₃; red and green in Fig. 3a) adopting extended
conformations². The NTDs and CTDs of the outer IN protomers
(chartreuse (light green) and orange in Fig. 3a) are unseen in PFV
intasome density maps. The architecture in the core density region
of the MMTV intasome is strikingly similar to the PFV structure.
For example, the positions of the CCDs and NTDs that contact
vDNA (red IN₁ and green IN₃ in Fig. 3a) are analogous. The two
remaining NTDs in the core region stabilize the CCD dimer interface
in an arrangement identical to that seen in the IN_NTD-CCD crystal
structure (Extended Data Figs 5d and 6e). Both flanking density
regions contain a CCD dimer that is also stabilized on each side by
NTDs, mimicking the CCD dimer arrangements found in the core
density region.

The arrangements of the CTDs differ dramatically between the 
MMTV and PFV structures, with MMTV IN residue Arg240 medi- 
ating several key contacts. For example, core protomer IN; Arg240 
interacts with VDNA while IN Arg240 interacts with IN; Asp233 
(Fig. 3b). Flanking protomer IN; Arg240 engages its INg neighbour 
whereas INg Arg240 mediates an inter-dimeric interaction with core 
protomer IN; Asp223, docking the flanking IN dimer to the core 
structure (Fig. 3b). To test the functionality of the flanking IN dimers,
complementation assays were performed by varying ratios of wild-type
(IN_WT) and mutant IN proteins in strand transfer reactions. Similar
strategies were used previously to dissect the division of labour within
multimeric complexes of retroviral IN¹⁸⁻²¹ as well as the related
bacteriophage Mu transpososome²².

Figure 3 | Comparison of MMTV and PFV intasome structures.
a, MMTV (left) and PFV (right) intasomes colour coded to highlight IN
dimers and constituent protomers. Core dimers A and B are red-orange
and green-chartreuse (light green), respectively, while MMTV flanking
IN dimers C and D are blue-sky blue and purple-light pink, respectively.
Coloured circles highlight similarly positioned CTDs between structures.
b, Close-up views of Arg240-mediated protein (left) and vDNA (right; G6
of plus-strand) interactions. For simplicity, only one set of asymmetric
interactions is shown. The interaction of IN; with residues 258-261 of INg
varied during model refinement, with the indicated interaction (as well as
other atomic distances) observed in the final refined model. Colours are
conserved between a and b.

IN_R240E, which like IN_WT purified as a dimer (Extended Data Fig. 7),
was defective for strand transfer activity (Fig. 4a). To assess the
functionality of Arg240-mediated IN-IN interactions, we compared
IN_R240E with IN_K165E, which carries a change that uniquely disrupts
IN-vDNA binding²³. In concordance with its inability to assume the
roles of inner IN₁ and IN₃ subunits of the core tetramer, IN_K165E mildly
stimulated the activity of limited IN_WT protein (Fig. 4b), presumably
providing a source for other IN subunits within the functional complex.
IN_R240E by contrast potently inhibited IN_WT function, confirming
the importance of Arg240-mediated protein-protein interactions
for MMTV IN activity. Two deletion mutant constructs, IN_CCD-CTD
and IN_CTD, which purified as dimers and monomers, respectively
(Extended Data Fig. 7), were additionally analysed. The reaction
composed of 75% IN_CCD-CTD supported near IN_WT activity, indicating that
this mutant could function when present in up to six of eight octamer
positions. This finding strongly supports flanking IN dimer
functionality, as the absence of the NTD would likewise preclude IN_CCD-CTD
from assuming intasome core positions 1 and 3. As the IN_CTD response
curve overlaid that predicted for non-functional protein, we moreover
conclude that CCD-mediated dimerization is critical for flanking IN
CTD function (Fig. 4).
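
As a rough illustration of how reference curves for such complementation experiments could be generated, the sketch below uses a purely stoichiometric model: if a mutant can occupy up to k of the eight octamer positions, wild-type IN is needed only for the remaining 8 − k positions, so predicted activity saturates once the wild-type fraction reaches (8 − k)/8. The model, the function name `predicted_activity` and the linear scaling are assumptions made for illustration; the source does not state how its theoretical curves were computed, and it reports that the measured wild-type response is nonlinear.

```python
def predicted_activity(wt_fraction: float,
                       tolerated_mutant_positions: int,
                       total_positions: int = 8) -> float:
    """Fraction of full activity under a simple stoichiometric model.

    Assumption (not from the source): activity is limited only by the
    supply of wild-type IN for the positions a mutant cannot occupy.
    """
    if not 0.0 <= wt_fraction <= 1.0:
        raise ValueError("wt_fraction must be between 0 and 1")
    if not 0 <= tolerated_mutant_positions < total_positions:
        raise ValueError("tolerated positions must be in [0, total_positions)")
    # Positions that must be filled by wild-type protein.
    required_wt = total_positions - tolerated_mutant_positions
    # Activity rises linearly with wild-type supply and is capped at 1.
    return min(1.0, wt_fraction * total_positions / required_wt)


if __name__ == "__main__":
    wt_fractions = (1.0, 0.75, 0.5, 0.25, 0.0)
    for k in (6, 4, 2, 0):
        curve = [round(predicted_activity(w, k), 2) for w in wt_fractions]
        print(f"mutant tolerated in {k} of 8 positions: {curve}")
```

Under this model a mutant tolerated in six of eight positions still gives full predicted activity at 25% wild-type protein, which is the qualitative behaviour the text describes for the CCD-CTD deletion construct.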

Analysis of IN primary sequences suggests an explanation for the
octameric arrangement of IN within the MMTV intasome when
an IN tetramer suffices for PFV integration. Whereas fifty-residue
CCD-CTD linkers afford the positioning of inner PFV IN CTDs
for vDNA and tDNA engagement”’, the analogous eight-amino-acid
MMTV linker is simply too short to accomplish the task (Extended
Data Fig. 8a). MMTV has accordingly evolved to employ flanking
IN dimers to nestle CTDs into the core intasome structure to
provide essential CTD function in trans for integration. As flanking IN
dimer CTDs 6 and 8 structurally mimic the PFV domains (Fig. 3a and
Extended Data Fig. 8a), we presume these CTDs will engage tDNA
during MMTV integration. Extending our analysis to other
retroviruses indicates that in addition to the spumaviruses, IN tetramers
could suffice for gamma- and epsilonretroviral intasome activity,
while an IN octamer will be required to catalyse alpharetrovirus in
addition to betaretrovirus integration (Extended Data
Fig. 8b). We note that an octameric IN architecture for the
alpharetrovirus Rous sarcoma virus intasome has recently been
independently determined²⁴. Whereas most studies have
highlighted a tetramer as the IN species that catalyses concerted
HIV-1 integration²⁵,²⁶, others have implicated a role for
octameric IN²⁷,²⁸. Given the intermediary length of lentiviral IN
CCD-CTD linker regions (Extended Data Fig. 8b), the
higher-order nature of IN in active HIV-1 intasomes may need to be
re-evaluated.

Figure 4 | MMTV intasome functionality. a, Representative agarose
gels. The reactions in lanes 1-4 contained 1.0, 0.75, 0.5, 0.25 µM IN_WT,
respectively; IN was omitted from the reaction in lane 5. Subsequent
five-reaction sets contained the same IN_WT concentrations with 0, 0.25, 0.5,
0.75, 1.0 µM of the indicated mutant protein, for a total concentration of
1 µM IN in lanes 6-25. Lanes 1-5 versus lanes 6-15 and 16-25 were from
separate agarose gels (see Supplementary Fig. 1 for gel source
data); other labelling as in Fig. 1. b, Dashed lines indicate theoretical
activities (graphed as percentage IN_WT activity) for mixtures that
contain a mutant protein that supports full IN_WT function when
present in six of eight octamer positions (blue dashed line), four of eight
positions (green dashes), two positions (purple dashes) or is unable to
complement IN_WT function (pink dashes). Actual activities are from
four technical replicates (average ± s.e.m.; see Supplementary Table 1
for source data). The nonlinear response of IN_WT (grey line with red
diamonds) probably reflects concentration-dependent cooperative
multimerization of IN with vDNA²⁹. The IN_WT alone and IN_WT + IN_CTD
values were not significantly different (P > 0.1; two-tailed t-test).
*P < 0.05; **P < 0.01.

PFV IN, which cleaves tDNA phosphodiester bonds separated by
4 bp, preferentially integrates into flexible sequences, whereas MMTV
and Rous sarcoma virus IN, which cleave tDNA with 6 bp staggers,
are relatively unconstrained by tDNA flexibility³⁰. Superposition of
the MMTV and PFV intasome structures revealed that the two sets 
of catalytic IN active sites almost perfectly aligned (Extended Data 
Fig. 8c). The same practical spacing of IN active sites therefore catal- 
yses PFV and MMTV integration into sharply bent versus relatively 
non-deformed tDNA, respectively (Extended Data Fig. 8d). Owing to 
their positions in the structure, we note that the flanking IN dimers 
dramatically expand the potential contact area with tDNA, which is 
likely to have consequences for the docking of alpha- and betaretroviral 
intasomes to host chromatin. 

METHODS

Statistical methods were not used to predetermine sample sizes. Experiments were
not randomized and the investigators were not blinded to allocation during
experiments and outcome assessment.

DNA constructs. Full-length (FL) MMTV IN³¹ and IN_CTD (IN_212-266 and IN_212-319)
expression constructs provided N-terminal His₆ tags followed by human rhinovirus
(HRV) 3C protease cleavage sites. The IN_NTD-CCD expression construct was made
by introducing a stop codon after the TCA that encodes for IN residue Ser212.
IN_K165E and IN_R240E expression constructs were made by PCR-directed mutagenesis.
DNA fragments corresponding to IN_51-212 (IN_CCD) and IN_51-319 (IN_CCD-CTD) were
amplified by PCR and subcloned into expression vector pET-20b (Novagen); these
proteins harboured cleavable C-terminal His₆ tags. The sequences of all
PCR-amplified regions of plasmid DNAs were verified by sequencing.

Protein expression and purification for intasome and IN activity assays. FL
INs, IN_CCD-CTD and IN_CTD/212-319 were expressed in Escherichia coli strain PC2
(ref. 32) in LB broth (supplemented with 50 µM ZnCl₂ for FL INs) by induction
with 0.4 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) (1 mM IPTG for
IN_CCD-CTD) at 30 °C (37 °C for IN_CCD-CTD and IN_CTD) for 4 h. Bacteria pellets were
resuspended in 20 mM HEPES, pH 7.6, 1 M NaCl, 5 mM 3-[(3-cholamidopropyl)
dimethylammonio]-1-propanesulfonate (CHAPS), complete EDTA-free protease
inhibitor (Roche). After sonication for 5 min at 50 mA, cell lysates were centrifuged
at 45,000 g for 1 h. The supernatant, supplemented with 5 mM imidazole, was
filtered through a 0.45 µm filter and purified using a Ni²⁺-charged HisTrap 5 ml
column (GE Healthcare) equilibrated with 20 mM HEPES, pH 7.6, 1 M NaCl, 5 mM
CHAPS, 15 mM imidazole. Proteins were eluted by a linear gradient of imidazole
(15-500 mM) containing a step wash at 65 mM imidazole using the ÄKTA purifier
system (GE Healthcare; for IN_CCD-CTD, a second step wash was done at 115 mM
imidazole). IN-containing fractions were diluted 1:5 with 20 mM HEPES, pH 7.6,
5 mM CHAPS, 2 mM dithiothreitol (DTT) and immediately loaded on a Heparin
HiTrap 5 ml column equilibrated with 20 mM HEPES, pH 7.6, 200 mM NaCl, 5 mM
CHAPS, 2 mM DTT. Proteins were eluted by a linear NaCl gradient from 200 mM
to 2 M (IN_CTD was isolated in the column flow through). IN-containing fractions
were pooled and cleaved with HRV 3C protease (GE Healthcare) overnight at 4 °C
to remove the His₆ tag. In lieu of purification by Heparin HiTrap, IN_CCD-CTD was
dialysed against 20 mM HEPES, pH 7.6, 1 M NaCl, 5 mM CHAPS, 2 mM DTT,
2 mM EDTA at 4 °C for 2 h, cleaved with HRV 3C protease overnight at 4 °C,
followed by dialysis against 20 mM HEPES, pH 7.6, 1 M NaCl, 5 mM CHAPS, 2 mM
DTT, 0.5 mM EDTA (SEC1 buffer). Cleaved proteins were purified by SEC using
a Superdex 200 10/300 column (GE Healthcare) equilibrated with SEC1 buffer.
Purified INs were concentrated by ultracentrifugation using 10-kDa molecular
mass cutoff Millipore concentrators and dialysed overnight against SEC1 buffer
supplemented to contain 10% glycerol. Protein concentration was determined by
spectrophotometry, and aliquots flash-frozen in liquid N₂ were stored at −80 °C.
Purified INs were analysed by SEC using a Superdex 3.2/300 column equilibrated
with SEC1 buffer; protein standards were from Bio-Rad.

MMTV intasome assembly. Intasomes were assembled by mixing 128 µM MMTV
IN with 38 µM 22 bp preprocessed vDNA (5’-CAGGTCGGCCGACTGCGGCA/
5’-AATGCCGCAGTCGGCCGACCTG) in 20 mM HEPES, pH 7.6, 600 mM NaCl,
2 mM DTT, before dialysis for 16 h at 4 °C against 25 mM Tris-HCl, pH 7.4, 80 mM
NaCl, 2 mM DTT, 25 µM ZnCl₂, 10 mM CaCl₂. The resulting milky white
precipitate was dissolved by adding NaCl to the final concentration of 250 mM, followed
by incubation on ice for 1 h. After centrifugation for 10 min at 20,000 g at 4 °C,
soluble intasomes were purified by SEC using Superdex 200 10/300 equilibrated
with 25 mM Tris-HCl, pH 7.4, 200 mM NaCl, 2 mM DTT, 25 µM ZnCl₂, 10 mM
CaCl₂ (SEC2 buffer). Intasome-containing fractions, which eluted around 10.5 ml,
were concentrated by ultracentrifugation using 10-kDa cut off concentrators.
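
The input concentrations of the assembly mixture can be compared with the 8 IN : 2 vDNA composition of the octameric intasome by simple arithmetic. The sketch below computes the IN : vDNA molar ratio and an upper bound on the intasome concentration. It assumes, hypothetically, complete assembly with no losses, which the source does not claim; `max_intasome_uM` is an illustrative helper name, not part of any published protocol.

```python
def max_intasome_uM(in_uM: float, vdna_uM: float,
                    in_per_complex: int = 8,
                    vdna_per_complex: int = 2):
    """Upper bound on intasome concentration and the limiting component.

    Assumes every complex contains 8 IN protomers and 2 vDNA duplexes
    and that assembly goes to completion (an idealization).
    """
    limit_from_in = in_uM / in_per_complex
    limit_from_vdna = vdna_uM / vdna_per_complex
    limiting = "IN" if limit_from_in <= limit_from_vdna else "vDNA"
    return min(limit_from_in, limit_from_vdna), limiting


if __name__ == "__main__":
    in_uM, vdna_uM = 128.0, 38.0  # input concentrations from the text
    bound, limiting = max_intasome_uM(in_uM, vdna_uM)
    print(f"IN:vDNA molar ratio = {in_uM / vdna_uM:.2f} (octamer needs 4.00)")
    print(f"upper bound = {bound:.1f} uM intasome, limited by {limiting}")
```

With these inputs vDNA is in slight excess over the 4 : 1 stoichiometry, so IN is the limiting component in this idealized calculation.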

In vitro integration assays. Strand transfer assays were performed as described
previously³⁰. Briefly, 1 µM intasome or 1 µM MMTV IN plus 0.5 µM vDNA were
mixed with 300 ng pGEM-3 tDNA in 40 µl of 20 mM HEPES, pH 7.4, 60 mM NaCl,
5 mM MgCl₂, 4 µM ZnSO₄, 10 mM DTT. Reactions incubated for 1 h at 37 °C were
terminated by adding 25 mM EDTA-0.5% SDS. DNA products deproteinized by
digestion with proteinase K and precipitated with ethanol were analysed by
electrophoresis through 1.5% agarose gels and visualized by staining with ethidium
bromide. Raltegravir, which was used at the final concentration of 100 µM, was
obtained from Selleck Chemicals. Proteins were premixed on ice before addition to
reactions for biochemical complementation assays. Concerted integration products
were measured by band intensity quantification relative to IN_WT product
formation, which was set to 100% using Molecular Imager Gel Doc XR+ System
with Image Lab software (BioRad); the background across eight gel images
corresponded to 1.26% ± 0.47% of IN_WT function.

Concerted integration reaction products were cloned and sequenced essen-
tially as previously described*”. Briefly, DNA excised from agarose gels was
repaired using Phi29 DNA polymerase (New England Biolabs) and ligated to a
PCR-amplified kanamycin resistance cassette. Plasmids recovered after
transformation of ligation mixtures into E. coli were sequenced using primers that annealed
to the ends of the cassette DNA.

Analytical ultracentrifugation. We analysed sedimentation velocity at 20 °C in a
Beckman Optima XL-I analytical ultracentrifuge using an An60Ti rotor and
standard two-channel Epon Centerpieces (Beckman-Coulter). Samples were prepared
in 20 mM phosphate buffer, pH 7.5, 150 mM NaCl at two loading concentrations,
absorbance (A280 nm) values of 0.3 and 0.9 for MMTV IN and the intasome, and
A260 nm values of 0.18 and 0.53 for vDNA, to exclude potential mass action
oligomerization. IN and vDNA were spun simultaneously at 35,000 r.p.m. for 22 h
while the intasome was spun at 27,000 r.p.m. for 12 h; the different rotor speeds
were based on the predicted masses of the different macromolecules.
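
For readers who think in terms of relative centrifugal force rather than rotor speed, the standard conversion RCF = 1.118 × 10⁻⁵ × r × N² (r in cm, N in r.p.m.) can be applied to the two speeds above. The radius used below (6.5 cm) is a placeholder assumption, not a value from the source; the ratio of the two forces does not depend on it.

```python
def rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) for a given speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


if __name__ == "__main__":
    ASSUMED_RADIUS_CM = 6.5  # hypothetical radial position, for illustration only
    fast = rcf(35_000, ASSUMED_RADIUS_CM)  # speed used for IN and vDNA
    slow = rcf(27_000, ASSUMED_RADIUS_CM)  # speed used for the intasome
    print(f"35,000 r.p.m. -> about {fast:,.0f} x g at the assumed radius")
    print(f"27,000 r.p.m. -> about {slow:,.0f} x g at the assumed radius")
    print(f"force ratio (independent of radius) = {fast / slow:.2f}")
```

Because the force scales with the square of the speed, the lower speed chosen for the larger intasome reduces the applied field by the square of the speed ratio, whatever the actual cell radius.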

Data were analysed using UltraScan-III version 2.2, release 2000 (ref. 33).
Hydrodynamic corrections for buffer density and viscosity were estimated with
UltraScan to be 1.041 g ml⁻¹ and 1.101 centipoise, respectively. The partial specific
volume of IN (0.728 ml g⁻¹) was estimated by UltraScan from its protein sequence
using a method analogous to the methods outlined in ref. 34. Sedimentation velocity 
data were analysed as described**. Optimization was performed by two-dimensional 
spectrum analysis*© with simultaneous removal of time-invariant and radially- 
invariant noise contributions*”. Two-dimensional spectrum analysis solutions, 
which are subjected to parsimonious regularization by genetic algorithm analysis**, 
were further refined using Monte Carlo analysis to determine confidence limits for 
the determined parameters*’. Calculations were performed on the Lonestar cluster 
at the Texas Advanced Computing Center at the University of Texas at Austin. 
Protein expression and X-ray crystallography. MMTV INCCD, INNTD-CCD and
INCTD fragments spanning MMTV IN residues 51-212, 1-212 and 212-266,
respectively, were expressed in BL21(DE3)-CodonPlus cells (Stratagene) in LB
medium (supplemented with 50 µM ZnCl2 for INNTD-CCD) by induction with 0.01%
(w/v) IPTG. Bacteria were lysed by sonication in 0.5 M NaCl, 50 mM Tris-HCl, pH
7.4, and the proteins were isolated by absorption to Ni-nitrilotriacetic acid agarose
(Qiagen). After digestion with HRV 3C protease to release His6 tags, the proteins
were further purified by ion exchange and SEC.

Crystals were grown by vapour diffusion in hanging drops by mixing 1 µl
protein (6-10 mg ml⁻¹ in 200 mM NaCl, 2 mM DTT, 25 mM Tris-HCl, pH 7.5)
and 1 µl reservoir solution, which contained 12.5% PEG-3350, 0.15 M ammonium
citrate, pH 6.5 (INCCD), 19% PEG-3350, 0.2 M MgCl2, 5% (v/w)
1-butyl-3-methylimidazolium dicyanamide (INNTD-CCD) or 19% isopropanol, 50 mM
ammonium acetate, 0.1 M HEPES-NaOH, pH 7.5 (INCTD). Crystals, cryoprotected
with 25% glycerol (INCCD, INNTD-CCD) or 30% PEG-400 (INCTD), were frozen
by immersion in liquid nitrogen. Diffraction data for the INCCD were collected
using a charge-coupled device detector at beamline BM14 (European Synchrotron
Radiation Facility) whereas INCTD and INNTD-CCD crystals were analysed at
beamline I03 (Diamond Light Source) equipped with a PILATUS direct detector. The
data, integrated with XDS“°, were scaled with Aimless*!. The structures, which were
each derived from a single crystal, were solved by molecular replacement in Phaser”
using search models generated from PDB entries 1ASV (CCD)*8, 3F9K (NTD)!°
and 1EX4 (CTD). The models were rebuilt using ARP/wARP* and/or manually
in Coot’® and refined in Phenix’” and/or Refmac**. Pseudo-merohedral twin law
(-h,-k,l) was accounted for during refinement of the INNTD-CCD structure. Final
models, validated with MolProbity”, had at least 96.9% of residues in the favoured
regions and none in the disallowed regions of the Ramachandran plot. Detailed
X-ray data collection and refinement statistics are given in Extended Data Table 1.
Cryo-EM data acquisition. Sample containing MMTV intasomes in SEC2 buffer 
supplemented to contain 0.05% NP-40 was applied onto freshly plasma treated 
(6s, Gatan Solarus plasma cleaner) holey carbon C-flat grids (CF-1.2/1.3-4C, 
Protochips), adsorbed for 30s and then plunged into liquid ethane using a manual 
cryo-plunger in an ambient environment of 4°C. 

Data were acquired over three separate sessions using Leginon software™’
installed on an FEI Titan Krios electron microscope operating at 300 kV, with a
dose of 40 electrons per square ångström at a rate of ~6.9 electrons per
pixel per second and an estimated underfocus ranging from 1 to 4 µm (centred at
2.6 ± 0.6 µm). The dose was fractionated over 50 raw frames collected over a 10-s
exposure time (200 ms per frame) on a Gatan K2 Summit direct detection device,
with each frame receiving a dose of ~6.5 electrons per pixel per second. Two
thousand seven hundred and fourteen movies were collected and recorded at a nominal
magnification of 22,500, corresponding to a pixel size of 1.31 Å at the specimen
level. The individual frames were gain corrected, aligned and summed using a 
graphic processing unit-enabled whole-frame alignment program as previously 
described®!, and exposure filtered’ according to a dose rate of 6.9 electrons per 
pixel per second. See Extended Data Table 2 for additional details on cryo-EM 
data collection. 
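
As a consistency check on the acquisition parameters quoted above, the total dose per square ångström can be recomputed from the dose rate, exposure time and pixel size. The sketch below uses only numbers stated in the text and in Extended Data Table 2; the function and constant names are illustrative.

```python
DOSE_RATE_E_PER_PIXEL_S = 6.9   # electrons per pixel per second
EXPOSURE_S = 10.0               # total exposure time in seconds
FRAME_S = 0.2                   # 200 ms per frame
PIXEL_SIZE_A = 1.31             # pixel size in angstroms at the specimen level


def frame_count(exposure_s: float, frame_s: float) -> int:
    """Number of frames recorded during one exposure."""
    return round(exposure_s / frame_s)


def total_dose_per_square_angstrom(rate: float, exposure_s: float, pixel_a: float) -> float:
    """Convert a per-pixel dose rate into an accumulated dose per square angstrom."""
    electrons_per_pixel = rate * exposure_s
    return electrons_per_pixel / pixel_a ** 2


if __name__ == "__main__":
    frames = frame_count(EXPOSURE_S, FRAME_S)
    dose = total_dose_per_square_angstrom(DOSE_RATE_E_PER_PIXEL_S, EXPOSURE_S, PIXEL_SIZE_A)
    print(f"{frames} frames, {dose:.1f} electrons per square angstrom")
```

The result, 50 frames and roughly 40 electrons per square ångström, agrees with the frame count and total dose reported for the data set.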

Cryo-EM image analysis. Pre-processing operations before the refinement of the 
final models were performed using the Appion package” and were conceptually 
identical to those previously described>”. Briefly, single intasome particles (244,315)
were selected from the aligned and summed micrographs, from which 147,850 were 
used to create an initial raw particle stack after removing regions of the micrographs 
containing carbon and large areas of aggregation. Two-dimensional alignments 
and classifications were performed using the CL2D® and Relion”® algorithms 
(Extended Data Fig. 1c), and an initial model was generated directly from the 
class averages using OptiMod®” (Extended Data Fig. 1d). After iterative rounds 
of two-dimensional alignment and classification, 77,365 particles remained for 
three-dimensional refinement and classification. Three-dimensional refinements 
and classifications were initially performed within Relion”, after which the param- 
eters were converted for use in Frealign**. The final map was refined in Frealign. 
Several conformational states of the intasome were observed after three-dimen- 
sional classification in both Relion and Frealign*’. Whereas one of the resulting 
maps yielded the stable intasome structure from 41,475 particles (Fig. 2a, Extended 
Data Fig. 2c and Extended Data Table 2), all other maps (one of which is shown 
in Extended Data Fig. 3b) displayed mobility in the flanking regions, which did 
not resolve by further classifying the data. To improve the resolution of the core 
region, we ran Relion and recovered four models in the classification. For each of 
the resulting maps, the flanking regions were segmented and treated with a soft- 
edged mask that adopted the shape of the remaining density. Subsequently, for 
each raw particle, the flanking region from the respective conformational state to 
which that particle belonged was computationally subtracted from the raw particle 
image. The contrast transfer function was included in the computational sub- 
traction process. In this manner, data sets lacking most of the flanking INs were 
created. Refinement of the core intasome data set was then conducted using the 
likelihood-based approach in Frealign”, effectively a focused classification of the 
core region. The best class was resolved to ~4 Å resolution in the most
homogeneous regions using 30,307 particles (Extended Data Fig. 2d and Extended Data
Table 2). Although slight ghost images remained for the flanking regions within 
certain particles, they did not dramatically affect the refinement; the use of a tighter 
mask facilitated the recovery of higher-resolution information. 
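
The particle counts given in this section can be summarized as a selection funnel. The following sketch computes the fraction retained at each step from the numbers in the text; the stage labels and function names are chosen here for illustration and are not taken from Appion, Relion or Frealign.

```python
# Sequential selection stages and particle counts quoted in the text.
STAGES = [
    ("picked", 244_315),
    ("initial stack", 147_850),
    ("after 2D classification", 77_365),
]

# Both final maps were derived from the 77,365 particles entering 3D refinement.
FINAL_MAPS = {"full intasome": 41_475, "core intasome": 30_307}


def stepwise_retention(stages):
    """Fraction of particles kept at each stage relative to the previous stage."""
    return {
        name: count / previous_count
        for (_, previous_count), (name, count) in zip(stages, stages[1:])
    }


def final_map_fractions(final_maps, refined_count):
    """Fraction of the refined particle set contributing to each final map."""
    return {name: count / refined_count for name, count in final_maps.items()}


if __name__ == "__main__":
    for name, fraction in stepwise_retention(STAGES).items():
        print(f"{name}: {fraction:.1%} of previous stage")
    for name, fraction in final_map_fractions(FINAL_MAPS, STAGES[-1][1]).items():
        print(f"{name} map: {fraction:.1%} of refined particles")
```

Roughly half of the particles are discarded at each cleaning step, and the core map draws on a smaller subset of the refined particles than the full map.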
Assembly of the atomic model. Models of the core intasome and the full octamer 
structures were built and refined in a stepwise manner using Rosetta! starting 
with rigid-body fitted X-ray structures of individual domains as input. Rosetta 
protocols were used for all parts of the modelling®. To optimally fit X-ray mod- 
els into the EM density, we first independently refined each individual domain 
(NTD, CCD and CTD) using multiple-input starting seeds. CCD1 and CCD2 were
each seeded with six starting X-ray models: independent CCD monomers from
chains A-D of the INCCD structure and monomers A-B of the CCD portions of
the INNTD-CCD structures. CTDs 1, 2, 5 and 6 were seeded with subunits A and
B of the INCTD X-ray model. Likewise, for NTD1 and NTD3, the two different
NTDs of the INNTD-CCD X-ray structure were used as input seeds. All models
were refined against the core intasome structure resolved to ~4-5 Å resolution
(Extended Data Fig. 2d). At least 2,000 models were generated from each and 
the lowest-energy model was selected for moving forward. Modelling quality was 
assessed by energy scores, structural similarity of the top scoring models and visual 
inspection (Extended Data Fig. 6a). We next proceeded to independently model 
IN;, INp, IN; and INg, thereby filling in the linker regions between individual 
domains using seven-amino-acid oligopeptides from the PDB’. This enabled 
de novo modelling for linker residues 45-54 between NTD1-CCD1 and residues
211-213 between CCD1-CTD1 and CCD2-CTD2 (some residues, as well as outlier
linker regions, were not modelled owing to disorder; Extended Data Fig. 6b,
c); modelling was facilitated by the presence of ‘bumps’ within the density that
corresponded to bulky amino-acid side chains, in particular within NTD1-CCD1,
which is located in the best-resolved region of the structure (Extended Data Fig. 
2d). IN; and IN) were each seeded with combinations of the best models arising 
from refinement of individual domains and were subsequently refined against the 
core intasome density map. Two thousand models were again generated for each, 
and the best were selected to move forward. This set of procedures produced FL 
models for IN, and IN) and models for CTD; and CTD, fitted to the EM protein 
density. MMTV DNA was modelled on the basis of the X-ray structure of the PFV 
intasome (PDB accession number 3L2Q). This model was rigid-body docked into 
the EM density and then relaxed with Rosetta. The complete intasome model was 
iteratively relaxed with Rosetta and then adjusted manually using Coot*®. Several 
iterative rounds of refinement and inspection were performed using MolProbity” 
at the end of each round until a consensus model was obtained (Extended Data 
Fig. 6c, d and Extended Data Table 2). 
IN linker regions. Linker lengths for Extended Data Fig. 8b were assessed by 
aligning published or in-house generated IN sequence alignments against align- 
ments based on known domain structures” (Extended Data Fig. 4a). The follow- 
ing sequences were included: gammaretroviruses: Moloney murine leukaemia 
virus (GenBank accession number J02255.1), reticuloendotheliosis virus strain A 
(DQ237900.1), feline leukaemia virus (NC_001940.1); epsilonretroviruses: walleye 
dermal sarcoma virus (NC_001867.1), walleye epidermal hyperplasia virus types 1
and 2 (AF133051.1 and AF133051.2, respectively); spumaviruses: PFV (U21247.1), 
macaque simian foamy virus (NC_010819.1), spider monkey foamy virus 
(EU010385.1); lentiviruses: HIV-1 strain NL4-3 (U26942.1), HIV-2 strain ROD 
(X05291.1), simian immunodeficiency virus strain agm.tan-1 (U58991.1), equine 
infectious anaemia virus (M16575.1), feline immunodeficiency virus (M25381.1), 
caprine arthritis encephalitis virus (M33677.1), bovine immunodeficiency virus 
(NC_001413.1); deltaretroviruses: bovine leukaemia virus (K02120.1), human 
T-cell lymphotropic virus types 1 and 2 (NC_001436.1 and NC_001488.1, respec- 
tively); betaretroviruses: MMTV (NC_001503.1), Mason Pfizer monkey virus 
(NC_001550.1), Jaagsiekte sheep retrovirus (NC_001494.1); alpharetroviruses: 
Rous sarcoma virus (J02342.1), lymphoproliferative disease virus (KC802224.1). 
Extended Data Figure 1 | Cryo-EM data and refinement.
a, Representative cryo-electron micrograph of MMTV intasomes, taken
at 2.7 µm underfocus. b, Same as in a, marked to show selected particles.
c, Two-dimensional class averages calculated using Relion®. d, Initial
model from the class averages calculated using OptiMod*”. e, Refined
reconstruction from the full data set, with an Euler angle distribution plot
showing the relative orientations of the particles.
Extended Data Figure 2 | Cryo-EM resolution analysis of reconstructed
intasome maps. a, Fourier shell correlation curve corresponding to the 
refined map generated from the full intasome data set. b, Fourier shell 
correlation curve corresponding to the refined map generated from the 
core intasome data set with the NTDs, CCDs and interdomain linker 
regions of the flanking IN dimers computationally subtracted. Average 
global resolutions in a and b are indicated. c, Refined map generated from 
the full data set (left) displayed side-by-side with the same map coloured 
for local resolution (right). d, Refined map generated from the core
intasome data set (left) displayed side-by-side with the same map coloured 
for local resolution (right) using the colouring scheme in c. e, Rotational 
snapshots of segmented density of CCD, with the fit of the refined model 
(see Extended Data Fig. 6) highlighting structural features evident at 
~4-5 Å resolution. Partial separation of β-strands, which is typically
evident at or beyond 4.5 Å resolution, is apparent.
Extended Data Figure 3 | Structural heterogeneity of the MMTV
intasome. a, Stable structural conformation of the MMTV intasome after
three-dimensional classification of the data. Slices from the density map
are displayed below. b, One of several conformations of MMTV intasome
refinement after three-dimensional classification of the data. Slices from
the density map are displayed below. Multiple fuzzy regions in the flanking
INs are apparent in b, which are indicative of remaining heterogeneity
within the data and/or continuous structural mobility of the region.
c, Overlay of the two reconstructed maps, highlighting the extent of
mobility within the flanking regions (brackets).
Extended Data Figure 4 | MMTV IN domains and intasome
sedimentation coefficient distribution. a, Primary IN sequence
alignment with boxes denoting canonical IN structural domains.
The N-terminal extension domain occurs in spuma-, gamma- and
epsilonretroviral IN proteins. Identical residues between MMTV, Rous
sarcoma virus, HIV-1 and PFV INs are highlighted by red background;
residues that are minimally conserved in three of the sequences are in red.
PFV IN secondary structure elements are from PDB accession number
3L2Q; MMTV elements are from the INNTD-CCD and INCTD crystal
structures described here (PDB accession numbers 5CZ2 and 5D7U,
respectively). Symbols α, β, η, TT and TTT represent α-helix, β-strand,
3₁₀-helix, α-turn and β-turn, respectively. Figure generated with ESPript
3.0 (ref. 61). b, Monte Carlo analysis of sedimentation velocity data for the
higher loading concentrations of vDNA (green), MMTV IN (blue) and
intasome (red). A clear shift to a discrete species at 10.5 is observed for 
the intasome, with minor IN and vDNA populations evident. Different 
centrifugation parameters for IN and vDNA versus intasomes (see 
Methods) probably attributed to the minor variations in sedimentation 
coefficient between major and minor IN and vDNA species. Measured 
sedimentation coefficients and calculated molar masses compared with 
theoretical molar masses are shown beneath the graph. 
Extended Data Figure 5 | MMTV IN domain crystal structures. a, Stereo
view of the final 2Fo − Fc density map of the INCCD crystal structure with
blue mesh contoured at 1σ. Amino-acid side chains are readily evident at
the 1.7 Å resolution. b, Stereo view of the final 2Fo − Fc density map of the
2.7 Å resolution INNTD-CCD crystal structure with blue mesh contoured
at 1σ. The map is centred on the DDE catalytic triad (red sticks); green
spheres, Mg²⁺ ions. c, Cartoon representation of the INCCD monomer
(one of four in the crystallographic asymmetric unit) coloured in gold.
Active site residues are shown as red sticks. d, Cartoon representation
of the INNTD-CCD dimer structure (one of three in the asymmetric unit).
The NTD and CCD are coloured green and gold, respectively. Red
sticks, active site residues; grey and green spheres, Zn²⁺ and Mg²⁺ ions,
respectively. e, Stereo view of the final 2Fo − Fc density map of the 1.5 Å
resolution INCTD crystal structure, shown as a green mesh contoured at 1σ.
f, Cartoon representation of one of the two CTD monomers present in the
asymmetric unit.
Extended Data Figure 6 | Molecular modelling of cryo-EM density.

a, Stereo views showing comparisons between the starting X-ray domain 
models and refined cryo-EM domain models for IN, highlight relatively 
minor structural perturbations that are evident only in the most flexible 
regions of the intasome. b, Linker region snapshots. Atomic models were 
built de novo from the cryo-EM density for the indicated linkers in the top 
two panels (residues 45-54 connecting NTD1 and CCD1 and CCD-CTD
residues 211-213). Linkers NTD2-CCD2, CCD5-CTD5 and CCD6-CTD6
were not modelled, but are shown as cryo-EM density (red) in the lower 
panels. c, Stereo view of the cryo-EM model for the MMTV intasome core 
region (Extended Data Fig. 2d), generated using Rosetta’*"!”. All domains
were refined starting with the X-ray crystal structures (Extended Data
Fig. 5). Specific linker regions were built de novo (continuous red lines) 
from the cryo-EM density, whereas lower-resolution linker regions (red 
dotted lines) were omitted from the model. d, Fourier shell correlation 
curve between the refined cryo-EM core intasome model and map, 
showing an average resolution of 4.8 Å. e, Comparison of two NTD-CCD
conformations in the intasome highlights the NTD-CCD linker, which 
assumes a retracted state in the outer IN, and IN, monomers of core 
intasome dimers A and B, respectively, as well as in flanking IN dimers C 
and D (left). The linker extends in core IN molecules IN; and IN3, which 
interact with the vDNA (right). 
Extended Data Figure 7 | Gel filtration profiles of INWT and IN mutant proteins. Elution profiles of mass standards in kilodaltons as well as theoretical
protein monomer (M) and dimer (D) positions are indicated.
Extended Data Figure 8 | Comparisons of PFV and MMTV intasome
structures. a, Cartoon representations of the inner IN3 green subunits of
the MMTV and PFV intasomes (Fig. 3a; vDNA strands are in grey).
CCD-CTD linker regions are highlighted in orange, and dashed lines circle
analogously positioned CTDs. Of note, this CTD in the MMTV structure
is coloured differently because it originates from a separate IN molecule
(IN8 from flanking dimer D). b, Lengths of NTD-CCD and CCD-CTD
interdomain linker regions across retroviral IN proteins; ‘+’ indicates the
presence of an N-terminal extension domain (NED). The multimeric state
of IN in known intasome structures is indicated by bold type. c, The PFV
intasome with bound tDNA (PDB accession number 3OS2; orange) was
superimposed with the MMTV intasome (blue). The distance between
overlaid active sites is in each case ~26 Å. d, Ninety-degree rotation of
superimposed structures, with proteins omitted for clarity. Canonical
B-form tDNA (magenta) was superimposed with PFV intasome tDNA.
The positions of phosphodiester bonds staggered by 4 bp in the PFV
crystal structure or by 6 bp in the modelled tDNA are indicated by spheres.
Extended Data Table 1 | X-ray crystallography data collection and refinement statistics

Construct              CCD                         NTD-CCD                     CTD

Data collection
Space group            P1                          P12₁1                       C222₁
Cell dimensions
  a, b, c (Å)          51.89, 53.71, 69.65         54.37, 83.15, 141.14        35.99, 42.28, 139.09
  α, β, γ (°)          69.69, 82.08, 63.97         90, 90.19, 90               90, 90, 90
Resolution (Å)*        46.6 - 1.70 (1.73 - 1.70)   70.6 - 2.72 (2.79 - 2.72)   40.4 - 1.50 (1.53 - 1.50)
Rmerge                 0.060 (0.57)                0.08 (0.534)                0.043 (0.585)
I/σ(I)                 21.0 (2.0)                  9.5 (2.0)                   29.2 (3.8)
Completeness (%)       99.1 (95.6)                 99.3 (99.0)                 99.8 (99.9)
Redundancy             5.2 (2.8)                   3.2 (3.1)                   12.2 (8.9)

Refinement
Resolution (Å)         32.8 - 1.70                 70.6 - 2.72                 40.4 - 1.50
No. reflections used   69,075                      32,115                      17,448
Rwork/Rfree            0.189/0.222                 0.245/0.266                 0.165/0.202
No. atoms
  Protein              4,983                       9,110                       890
  Ligand/ion           0                           12                          8
  Water                437                         0                           69
B-factors
  Protein              26.0                        70.9                        28.5
  Ligand/ion           -                           45.6                        46.4
  Water                33.5                        -                           46.9
R.m.s. deviations
  Bond lengths (Å)     0.007                       0.010                       0.005
  Bond angles (°)      0.954                       1.281                       0.911

*Data for the highest resolution shells are given in parentheses.
Extended Data Table 2 | Cryo-EM data statistics

Construct                            core MMTV intasome       full MMTV intasome

EM data collection/processing
Microscope                           Titan Krios              Titan Krios
Voltage (kV)                         300                      300
Camera                               Gatan K2 Summit          Gatan K2 Summit
Defocus range (µm)                   1.0-4.0                  1.0-4.0
Defocus mean ± std (µm)              2.6 ± 0.6                2.6 ± 0.6
Exposure time (s)                    10                       10
Dose rate (e⁻/pixel/s)               6.9                      6.9
Total dose (e⁻/Å²)                   40                       40
Pixel size (Å)                       1.31                     1.31
Number of micrographs                2,714                    2,714
Number of particles (processed)      147,850                  147,850
Number of particles (refined)        77,365                   77,365
Number of particles (in final map)   30,307                   41,475
Symmetry                             C2                       C2
Resolution (global) (Å)*             4.8                      6.0
Resolution range (local) (Å)         4-5                      5-6
Map sharpening B-factor (Å²)         -300                     -460

Model refinement
Space group                          P1                       -
Cell dimensions
  a = b = c (Å)                      151.2                    -
  α = β = γ (°)                      90                       -
Number of atoms (modelled)           11,462                   -

Validation
MolProbity score                     1.46 (96th percentile)   -
Clashscore, all atoms                2.27 (99th percentile)   -
Protein
  Ramachandran favoured (%)          1,115 (92.76)            -
  Allowed (%)                        87 (7.24)                -
  Disallowed (%)                     0 (0)                    -
  Good rotamers (%)                  1,035 (99.71)            -
  Cβ deviations >0.25 Å (%)          0 (0)                    -
  Cis prolines (%)                   8/88 (9.09)              -
  Bad bonds (%)                      2/10,140 (0.02)          -
  Bad angles (%)                     3/13,810 (0.02)          -
DNA
  Bad bonds (%)                      0/1,834 (0)              -
  Bad angles (%)                     1/2,822 (0.04)           -
R.m.s. deviations
  Bond lengths (Å)                   0.012                    -
  Bond angles (°)                    1.334                    -

*Resolution assessment based on frequency-limited refinement using the 0.143-threshold for resolution analysis.
Crystal structure of the Rous sarcoma virus intasome


Zhiqi Yin, Ke Shi, Surajit Banerjee, Krishan K. Pandey, Sibes Bera, Duane P. Grandgenett & Hideki Aihara


Integration of the reverse-transcribed viral DNA into the host 
genome is an essential step in the life cycle of retroviruses. Retrovirus 
integrase catalyses insertions of both ends of the linear viral DNA 
into a host chromosome!. Integrase from HIV-1 and closely related 
retroviruses share the three-domain organization, consisting of 
a catalytic core domain flanked by amino- and carboxy-terminal 
domains essential for the concerted integration reaction. Although 
structures of the tetrameric integrase-DNA complexes have been 
reported for integrase from prototype foamy virus featuring an 
additional DNA-binding domain and longer interdomain linkers”, 
the architecture of a canonical three-domain integrase bound to 
DNA remained elusive. Here we report a crystal structure of the 
three-domain integrase from Rous sarcoma virus in complex 
with viral and target DNAs. The structure shows an octameric 
assembly of integrase, in which a pair of integrase dimers engage 
viral DNA ends for catalysis while another pair of non-catalytic 
integrase dimers bridge between the two viral DNA molecules 
and help capture target DNA. The individual domains of the eight 
integrase molecules play varying roles to hold the complex together, 
making an extensive network of protein-DNA and protein-protein 
contacts that show both conserved and distinct features compared 
with those observed for prototype foamy virus integrase. Our 
work highlights the diversity of retrovirus intasome assembly and 
provides insights into the mechanisms of integration by HIV-1 and 
related retroviruses. 

Integrases (INs) from lentiviruses including HIV-1 and the
phylogenetically closely related alpharetroviruses including avian Rous
sarcoma virus (RSV) share the conserved three-domain organization
consisting of the Zn²⁺-coordinating amino (N)-terminal domain
(NTD), the catalytic core domain (CCD), and the β-strand-rich
carboxy (C)-terminal domain (CTD) (Fig. 1a and Extended Data Fig. 1).
We used the three-domain RSV IN construct biochemically fully active
in concerted integration® and a branched DNA substrate mimicking
the product of the concerted integration reaction’ (Fig. 1b) to assemble
and crystallize the stable RSV intasome complex (Extended Data
Figs 2 and 3). The crystallized RSV intasome showed in solution an
apparent molecular mass of 255 kDa by size-exclusion chromatography
(SEC) and 240 (±10) kDa by SEC with multi-angle light scattering
(SEC-MALS) analysis, larger than the expected mass of 168 kDa for a
tetramer of RSV IN(1-270) bound to the branched DNA (Extended
Data Fig. 4a-c). The structure of the RSV intasome was determined
by molecular replacement phasing and refined to 3.8 Å resolution
(Extended Data Table 1).

In contrast to the general assumption that retrovirus intasomes consist 
of an IN tetramer, the RSV intasome structure shows that it contains 
eight IN molecules (Fig. 1c-f and Supplementary Video), clearly 
supported by the selenium anomalous difference Fourier peaks for 
a crystal grown with selenomethionine-labelled IN (Extended Data 
Fig. 5). The observed octameric assembly (288 kDa) is consistent with 
the larger apparent molecular mass of RSV intasome in vitro and chemical
cross-linking analysis (Extended Data Fig. 4d). The RSV IN octamer
contains a core tetramer, which consists of two sets of ‘proximal’ IN
dimers, and two additional sets of ‘distal’ IN dimers. The CCDs of the 
four IN dimers are positioned in an approximately twofold symmet- 
rical arrangement that resembles a parallelogram (Fig. 1c and d). The 
proximal IN dimer consists of ‘inner’ and ‘outer’ IN subunits, where 
the active site of the inner subunit accommodates the viral/target DNA 
junction (Fig. 2a-c). The NTD of each inner IN subunit interacts in 
trans with the viral DNA engaged by the opposing proximal IN dimer, 
with the extended linker between NTD and CCD traversing the grooves 
of both viral DNA molecules to make additional contacts (Figs 2a and 
3a-c). The domain-swapped arrangement of NTD and CCD of the 
proximal IN dimers is analogous to that observed in the tetrameric 
prototype foamy virus (PFV) intasome’. 
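
The mass argument for an octamer can be made explicit with simple arithmetic. The sketch below assumes that the quoted tetramer (168 kDa) and octamer (288 kDa) masses refer to complexes with the same branched DNA and differ only by four IN subunits; the per-subunit and DNA masses it derives are inferences from those two numbers, not values stated in the text, and the function names are illustrative.

```python
TETRAMER_KDA = 168.0   # expected mass of an IN tetramer bound to the branched DNA
OCTAMER_KDA = 288.0    # mass of the observed octameric assembly
MEASURED_KDA = {"SEC": 255.0, "SEC-MALS": 240.0}  # apparent masses in solution


def subunit_mass(tetramer_kda: float, octamer_kda: float) -> float:
    """Mass of one IN subunit, assuming the two models differ by four subunits."""
    return (octamer_kda - tetramer_kda) / 4


def dna_mass(tetramer_kda: float, subunit_kda: float) -> float:
    """Mass left for the DNA after removing four IN subunits from the tetramer model."""
    return tetramer_kda - 4 * subunit_kda


def closer_model(measured_kda: float) -> str:
    """Name the stoichiometry whose expected mass lies nearer the measurement."""
    d_tet = abs(measured_kda - TETRAMER_KDA)
    d_oct = abs(measured_kda - OCTAMER_KDA)
    return "octamer" if d_oct < d_tet else "tetramer"


if __name__ == "__main__":
    sub = subunit_mass(TETRAMER_KDA, OCTAMER_KDA)
    print(f"IN subunit ~{sub:.0f} kDa, DNA ~{dna_mass(TETRAMER_KDA, sub):.0f} kDa")
    for method, mass in MEASURED_KDA.items():
        print(f"{method}: {mass:.0f} kDa is closer to the {closer_model(mass)} model")
```

Under these assumptions both solution measurements lie nearer the octamer mass than the tetramer mass, in line with the argument in the text.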

However, despite the shared structural features that support the basic 
chemistry of integration, the RSV and PFV intasomes have very differ- 
ent architectures (Fig. 3 and Extended Data Fig. 6). The PFV intasome 
consists of four IN molecules, corresponding only to the core tetramer 
(two proximal IN dimers) of the RSV intasome. In the PFV intasome 
the inner IN subunits make all DNA interactions, while only the CCD 
is ordered for the outer subunits (Fig. 3d-f). In the RSV intasome, 
CTDs of the inner and outer subunits of the proximal IN dimer take 
unique conformations and tightly associate with each other (Fig. 2c), 
and this CTD dimer makes viral DNA contacts both in cis and in trans 
(Fig. 2a). Moreover, the octameric RSV intasome contains two addi- 
tional IN dimers; the CTDs from the distal IN dimers bridge between 
the proximal IN dimers, making additional contacts with both viral 
DNAs analogously to the CTD of the catalytic IN subunit in the PFV 
intasome (Fig. 3b, e). The CCDs of the distal IN dimers, anchored to the 
core of the RSV intasome through these CTD interactions (Fig. 2a), are 
positioned at the outer corners of the parallelogram, loosely associated 
with the distal regions of target DNA through non-specific interactions 
(Figs 1d-f and 2a). The six remaining NTDs from the outer subunits 
of the proximal IN dimers and both subunits of the distal IN dimers 
are bound intra-molecularly to CCD (Extended Data Fig. 7c, d). In 
total, over 10,000 Å² of molecular surface is buried in IN-IN interfaces
within the RSV intasome, approximately half of which is accounted
for by the conserved CCD dimerization interface, and ~6,000 Å² in
IN-DNA interfaces. 

Although the CCDs and CTDs from all four IN dimers within the 
RSV intasome individually self-dimerize in the same fashions, the 
relative positioning of the CTDs with respect to the CCDs differs 
between the proximal and distal IN dimers, corresponding to their 
different roles. The CCD-CTD configuration for the proximal IN 
dimers is very similar to that previously observed for DNA-free RSV 
INs®8 (Fig. 2c), suggesting that RSV IN dimer in its native conforma- 
tion is poised for viral DNA binding and catalysis, as noted earlier®. 
The CTD of the inner catalytic subunit of the proximal dimer binds to 
viral DNA near the viral/target junction but on the opposite face of the 
double-stranded DNA, while it also makes contacts in trans with 
the other viral DNA (Fig. 2a, b). The CTD of the outer subunit of 
the proximal dimer binds to the distal region of the viral DNA in cis. 
Unlike in the proximal IN dimer, the CCD-CTD configuration for the
distal IN dimers shows deviation from the canonical conformation,
which can be described by a swing of the CTDs relative to the CCDs
and disruption of the parallel β-sheet-like conformation of the CCD-CTD
linkers® (Fig. 2d). This alternative CCD-CTD orientation allows
the distal IN dimers to fit in the intasome without making steric
clashes with the proximal INs or the 5’ overhang of the viral DNA
strand (Fig. 2a). The CCDs of the distal IN dimers may be positioned
similarly in the absence of target DNA, serving as a platform for target
DNA capturing.

Figure 1 | Overall structure of the RSV intasome. a, Comparison of the
domain organization between RSV, HIV-1, and PFV INs. b, Branched
DNA structure mimicking the product of the concerted integration
reaction used in assembling the RSV intasome. c, Structure of the RSV
intasome, viewed along its pseudo-twofold axis from the viral DNA side.
The three structural domains of IN are colour coded as in a. The grey
spheres represent zinc ions in NTD. Two subunits within each IN dimer
are coloured slightly differently from each other. d-f, Structure of the
RSV intasome with the eight IN molecules coloured individually, shown
in three orthogonal orientations. The protein surface is shown in d and e.
The DNA strands are colour coded as in b.

Figure 2 | Proximal and distal IN dimers. a, View highlighting
interactions between a proximal (green/cyan) and a distal (slate/orange)
IN dimers and their interactions with DNA. The IN and DNA strands are
coloured as in Fig. 1d-f. b, Close-up view around the viral DNA terminus.
The CCD loop (residues 144-153) centred around Ser150 of the catalytic
IN subunit is coloured in violet. The catalytic triad residues (D64, D121,
E157) are shown in red sticks in a-c. c, d, Superposition of the proximal IN
dimer (c; green/cyan) or the distal IN dimer (d; slate/orange) on the RSV
IN CCD-CTD dimer in its native conformation (PDB accession number
4FW1; light grey)®. The grey spheres represent zinc ions in NTD.

The asymmetrically associated CTDs of the proximal and distal IN 
dimers further interact with each other and with the NTD of the cat- 
alytic IN subunit to crosslink between the two viral DNAs (Figs 2a 
and 4a, b). Each CTD dimer interacts with both of the viral DNA 
molecules at different positions, resulting in four distinct DNA-binding 
modes of individual CTDs. The basic amino acids Arg227, Arg244, 
Arg263, and Lys266 from various CTD monomers make contacts with 
the viral DNA molecules, and each viral DNA is sandwiched between 
separate CTD dimers (Fig. 4c). Arg244 of the inner catalytic subunit of 
the proximal IN dimer is positioned in the major groove of viral DNA, 
closest to G7 of the non-transferred strand. The GC pair at this posi- 
tion is critical for concerted integration by RSV IN’. The correspond- 
ing residue Glu246 of HIV-1 IN was shown by disulfide cross-linking 
studies to interact with A7 of the non-transferred strand", suggesting 
similar modes of viral DNA sequence recognition by CTD between 
RSV and HIV-1 INs. Both RSV R244A/C and HIV-1 E246A IN mutants 
show reduced 3’ processing and strand-transfer activities°!'!, possibly 
reflecting the importance of these residues from various IN subunits. 
Mutation of conserved Trp233, which is stacked between the Arg227 
and Lys266 side chains, to Glu or Ala but not Phe, abolishes binding to 
the viral DNA long terminal repeat sequence and concerted integration 
by RSV IN’. The corresponding HIV-1 IN mutations W235E/A/F have 
parallel effects on concerted integration activity and virus replication, 
suggesting the importance of an aromatic residue at this position in 
orienting the basic side chains'*">. Similarly, mutation of Trp259 buried 
in the CTD dimer interface (Fig. 4b)°* as well as involved in multiple 
interactions near the viral DNA 5’ end (Fig. 2b) abolishes all enzymatic 
Figure 3 | Comparison between the RSV and PFV intasomes. a, b, The 
octameric RSV intasome. The eight IN molecules and the DNA strands are 
coloured as in Fig. 1d-f. The catalytic triad (DDE) residues of the inner 
subunit of the proximal IN dimers are shown in red in b. c, Conformation 
of the inner catalytic IN subunit in the RSV intasome. The protein chain is 
coloured in a gradient of blue to red from N to C termini, respectively. The 
NTDs and CCDs of the distal IN dimer are omitted. d, e, The tetrameric 
PFV intasome*. The colour scheme follows that used for the proximal 
IN dimers of RSV intasome in a and b. The direction of DNA helical 
axis between the two integration sites on opposing strands (6 bp for RSV 
and 4 bp for PFV) differs significantly between the RSV (b) and PFV (e) 
intasomes. f, The inner catalytic IN subunit in the PFV intasome, coloured 
in a gradient of blue to red as in c. Note the presence of an extra domain on 
the N terminus (NED) and different positioning of CTD compared with 
RSV IN in c. 


activities of RSV IN®"®, reflecting the important roles of this residue 
in engaging viral DNA. 

The CCD of the catalytic RSV IN molecules engages viral and 
target DNAs primarily through interactions in their minor grooves, 
as observed for PFV IN”. The long α-helix (α7; residues 154-174) 
that harbours one of the catalytic triad of metal-coordinating residues, 
Glu157, inserts into a significantly widened minor groove near the 
viral DNA terminus, where Arg158, Arg161, and Lys164 side chains 
make DNA base or backbone contacts (Fig. 4d and Extended Data 
Fig. 7h, i). The preceding loop centred around Ser150, which is highly 
flexible in the DNA-free IN°*”, is positioned between the viral DNA 
5’ overhang and the 3’ end of the cleaved target DNA, displacing the 
T3 base opposite the terminal adenine (A20) of the transferred strand 
(Fig. 2b). DNA contacts in this minor groove also include hydrogen 
bonding between Thr66 and the backbone phosphate group of the ter- 
minal nucleotide (A20) of the transferred viral DNA strand (Fig. 4d). 
Mutations of the corresponding HIV-1 IN residue T66A/I/K confer 
resistance against the IN strand-transfer inhibitors (INSTIs; 
http://hivdb.stanford.edu/DR/INIResiNote.html), probably as a result of 
subtle changes of the IN-viral DNA interface. The NTD of the opposing 
catalytic subunit, contributed in trans, binds in the adjacent major 
intasome similar to Fig. 3a. b, A crown-like structure formed by the CTD 
of all eight INs and NTD and CCD of the catalytic INs, which encircles 
the two viral DNA molecules in black (boxed region in a). Trp259 side 
chains buried in the interface of each CTD dimer are shown. c, Viral DNA 
contacts by the CTDs. The side chains of Arg227, Arg244, Arg263, and 
Lys266 are shown. NTD and CCD are omitted. d, Close-up view around 
the viral DNA end, showing interactions by CCD of a catalytic subunit, 
NTD of the opposing catalytic subunit contributed in trans, and the CTDs 
of a distal IN dimer. The catalytic triad residues are shown in red sticks. 
The grey sphere represents a zinc ion in NTD. 


groove and places Arg17 and Arg31 side chains for potential base-specific 
contacts (Fig. 4d). The hydrophobic NTD-CCD interface is 
centred on Phe199 from CCD, which explains why the F199K muta- 
tion selectively abolishes concerted integration by RSV IN®!®. The 
CTDs of the distal IN dimer further extend the viral DNA interactions 
in this region (Fig. 4d). 

The three flipped-out terminal nucleotides at the 5’ end of the 
non-transferred viral DNA strand, including the two overhanged bases 
and the displaced T3, bridge between CCD of the catalytic IN subunit 
and a CTD from the distal IN dimer, rather than between CCD and 
CTD of the same molecule as observed in the PFV intasome (Figs 2b, 
4b and 5a). Arg263 from this CTD points towards the non-transferred 
strand G4 opposite the CA dinucleotide at the viral DNA terminus 
(Fig. 2b), which could be relevant to the effect of mutation of the cor- 
responding HIV-1 residue Arg263 on catalysis and drug resistance’®. 
The two viral DNAs in the RSV intasome branch out with their helical 
axes skew and at an angle of ~60°, which is smaller than the viral DNA 
split of ~80° in the PFV intasome (Extended Data Fig. 8). Accordingly, 
the viral DNA molecules in the RSV intasome are positioned closer 
to each other than those in the PFV intasome, with the backbone 
phosphate oxygen atoms at the closest point ~5 Å apart. The viral DNA 
molecules in the RSV intasome are surrounded by a highly positively 
charged surface formed by a network of CTDs and NTDs, which may 
alleviate potential electrostatic repulsions between DNA strands and 
help hold the complex together (Fig. 4a-c). 

The target DNA in the RSV intasome shows a strong overall bending 
of ~90° away from the core of the complex (Fig. 5a and Extended Data 
Fig. 7b), which may help prevent the reversal of integration as noted 
for the PFV intasome and related transpososome structures*!*. The 
bent conformation is stabilized by the minor groove contacts made 
by CCD of the catalytic subunits near the viral/target DNA junction, 
which include insertion of a short helix (α5) harbouring Ser124, a 
residue important in target DNA capturing’. Localized kinks at the 


© 2016 Macmillan Publishers Limited. All rights reserved 


Figure 5 | Target DNA. a, The sharply bent target DNA spanning 
the CCDs of the catalytic IN subunits. Ser124 and Glu229 side chains 
potentially involved in target DNA contacts are shown. b, The central 
6 bp region viewed from the viral DNA side, showing distorted DNA 
conformation with severely compromised base-stacking. c, The target 
DNA viewed from the viral DNA side, highlighting a large shift in the 
helical axis. 


viral/target DNA junctions cause the DNA trajectory also to zigzag 
in the plane perpendicular to the direction of the primary bending, 
creating a ~20 Å shift in the helical axis with an overall positive writhe 
(Fig. 5c). As a result, the target DNA conformation in the RSV intasome 
deviates significantly from that in the PFV intasome? (Extended Data 
Fig. 8c) or the DNA structure in nucleosomes”’. This suggests that 
RSV integration into nucleosomes would require a large conforma- 
tional change in the target DNA, probably more extensive than that 
observed in the PFV intasome-nucleosome complex’. 

The active sites of the catalytic IN subunits in the RSV intasome 
are separated by ~27 Å, larger than the distance of ~21 Å across the 
major groove between the backbone phosphates 6 base pairs (bp) 
apart in a B-form DNA. Accordingly, the 6 bp spacer region of target 
DNA is underwound and shows severely compromised base-stacking 
in addition to widening of the major groove (Fig. 5b and Extended 
Data Fig. 9). In particular, the central dinucleotide step shows a dis- 
torted conformation where the unstacked nucleobases stack on the 
deoxyribose moiety of the adjacent nucleotide; the unique conformation 
may explain the modest target sequence preference in RSV 
integration?**3. A CTD from each distal IN dimer has a loop (between β6 
and β7) positioned over the major groove of the target DNA in this region, 
with the Glu229 side chain positioned for potential base contacts 
(Fig. 5a). This loop from various CTDs is involved in viral DNA con- 
tacts, interactions between NTD and CTDs, and those linking the 
CTDs from the proximal and distal IN dimers in the RSV intasome 
(Fig. 4b and Extended Data Fig. 7e-g). 

The RSV intasome structure shows a novel architecture of 
IN-DNA complex and highlights a remarkable diversity among 
retrovirus intasome assemblies. In particular, the structurally con- 
served CTD plays critical roles in the octameric RSV intasome dis- 
tinct from those played by CTD in the tetrameric PFV intasome**”. 
PFV IN features a unique ~30-amino-acid insertion between CCD 
and CTD (Fig. 1a and Extended Data Fig. 1), and the resulting 
long interdomain linker not only makes direct viral DNA contacts 
but also allows CTD of the catalytic IN subunit to fit between the 
NTD and CCD from the same molecule to interact with both viral 
DNAs (Fig. 3f). In the case of RSV IN, which has a much shorter CCD-CTD 
linker, the NTD and CCD of the catalytic IN molecule are instead 
bridged by the CTDs contributed in trans from the distal IN dimer 
(Fig. 3c). Analogously, the extra ~50 amino acids on the N terminus 
of PFV IN constitute an independent DNA-binding domain (NED), 
and its interaction with the distal region of the viral DNA substi- 
tutes for viral DNA interactions made by the multiple copies of 
CTDs as observed in the RSV intasome (Fig. 3a, d). The larger 
(408 amino acids) IN from a gammaretrovirus murine leukaemia 
virus also has both the NED and a long CCD-CTD linker”, consistent 
with the idea that these two structural features have complementary 
functions. The different modes of viral DNA engagement by either 
tetrameric or octameric IN, using similar sets of structurally conserved 
domains, suggest divergent evolution of the integration machineries. 
Of note, an octameric intasome assembly similar to that reported here 
has recently been observed by cryo-electron microscopy for another 
three-domain IN from mouse mammary tumour virus”. 


The integration of retroviral DNA into the host genome was first 
postulated on the basis of the observation of RSV-infected cells”°, and 
identification and characterization of IN from retroviruses including 
RSV have provided a basic understanding of retrovirus integration 
and foundation for studying HIV-1 IN?7-°°. Our findings on the RSV 
intasome structure shed light on the previously unappreciated diver- 
sity in retroviral DNA integration and define a new framework for 
future studies of retrovirus IN. Availability of the intasome structures 
from multiple retrovirus systems will also allow more accurate model- 
ling of the HIV-1 IN-DNA interactions to help development of novel 
antiretroviral drugs, including IN inhibitors that target outside the most 
conserved active-site features. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

RSV intasome preparation. RSV IN and its mutant forms were overexpressed in 
Escherichia coli BL21 (DE3) and purified as previously described®. The proteins 
were stored in aliquots at −80 °C in 20 mM HEPES-NaOH (pH 7.5), 1.0 M NaCl, 
20 μM ZnCl2, 5 mM 2-mercaptoethanol, and 10% (w/v) glycerol. The branched 
DNA substrate mimicking the product of the concerted integration reaction 
was obtained by annealing three synthetic oligonucleotides (Integrated DNA 
Technologies). A similar strategy had been used previously to prepare a PFV 
IN-DNA complex, which demonstrated that the IN-DNA complex assembled 
on the designed integration product has essentially an identical structure to the 
equivalent complex formed via the forward integration reaction’. The viral DNA 
branches carrying the high-affinity gain-of-function mutant RSV U3 long ter- 
minal repeat sequence (GU3)*! were attached to the target DNA duplex with a 
palindromic 6 bp spacer to generate a fully symmetrized structure. To prepare the 
RSV IN-DNA complexes for crystallization, 30 μM half-site DNA substrate was 
mixed with 120 μM RSV IN in 20 mM HEPES-NaOH (pH 7.5), 500 mM NaCl, 
20% (w/v) glycerol, and 1 mM tris-(2-carboxyethyl)phosphine (TCEP). The mix- 
ture was dialysed against low salt buffer (20 mM HEPES-NaOH pH 7.5, 125 mM 
NaCl, 20% (w/v) glycerol, 1mM TCEP) at 25°C overnight. At the end of this first 
dialysis, essentially all IN and IN-DNA complex precipitated. The mixture was 
subsequently dialysed against high salt buffer (20 mM MES-NaOH pH 6.0, 1.2M 
NaCl, 20% (v/v) dimethyl sulfoxide (DMSO), 5% (w/v) glycerol, 1 mM TCEP) at 
25°C for 2-4h. At the end of this second dialysis, the reaction mixture became 
clear. The solubilized RSV IN-DNA complex (intasome) was purified through 
SEC (Superdex 200 10/300 GL, GE Healthcare) running with 20 mM MES-NaOH 
pH 6.0, 1.2 M NaCl, 20% (v/v) DMSO, 5% (w/v) glycerol, and 1mM TCEP. The 
isolated intasome remains stable in the high-salt condition containing 1.2 M NaCl 
that precludes complex formation, suggesting that once the intasome forms, it is 
kinetically trapped. The solubility-enhancing RSV IN mutation F199K completely 
abolished intasome formation. For the SEC-MALS analysis, a modified condition 
was used for the intasome re-solubilization and isolation for better baseline stability 
(Extended Data Fig. 4c). 

Crystallization. For crystallization, the RSV intasome assembled through dialysis 
and purified by SEC was concentrated to 4-6 mg ml⁻¹ using a centrifugal 
concentrator (Amicon). Various combinations of the lengths of the viral DNA (ranging 
from 16 bp to 25 bp) and flanking target DNA (ranging from 14 bp to 22 bp, corre- 
sponding to the full-target DNA lengths of 34-50 bp) were screened in crystalli- 
zation trials. The extensive screening yielded only one crystal form with a specific 
combination of 22-base (length of the non-transferred strand) viral DNA branches 
and 16 bp target DNA flanks on either side of the central 6 bp spacer (Fig. 1b and 
Extended Data Fig. 2). DNA substrates with two slightly different target sequences 
were used (5′-AATGTTGTCTTATGCAATACTC-3′/5′-GAGTATTGCATAAGACAACAGTGCACGAATCTTGAAGACACT-3′/5′-AGTGTCTTCAAGATTC-3′, 
or 5′-AATGTTGTCTTATGCAATACTC-3′/5′-GAGTATTGCATAAGACAACAGTCGACCAACCTTCAACTTAGC-3′/5′-GCTAAGTTGAAGGTTG-3′), and 
they produced essentially the same crystals. The RSV intasome crystals were 
grown through reverse vapour diffusion in hanging drops at 22°C, by mixing 
1.5 μl IN-DNA complex solution with 1.5 μl reservoir solution (3.2 M sodium 
formate). Crystals appeared within 3-5 days and reached a size of ~150-300 μm in 
3-5 days. Even though the RSV intasome crystals initially diffracted X-ray poorly 
(~10 Å), soaking the crystals with a metatungstate cluster compound 
dramatically improved the resolution (Extended Data Fig. 3b, c). The tungsten cluster 
was later found to bind between IN dimers from separate intasome complexes to 
mitigate crystal lattice disorder. The crystals were soaked overnight with 0.15 mM 
metatungstate cluster Na6[H2W12O40] in a stabilization buffer consisting of 3.2 M 
sodium formate, 16 mM MES-NaOH pH 6.0, 0.8 M NaCl, 16% (v/v) DMSO, 4% 
(w/v) glycerol, and 1mM TCEP. After soaking, the crystals were cryo-protected 
in 3.2 M sodium formate, 16 mM MES pH 6.0, 0.8 M NaCl, 16% (v/v) DMSO, 12% 
(w/v) glycerol, and 1mM TCEP, and were frozen by rapid immersion in liquid 
nitrogen. The full-length wild-type RSV IN(1-286), the C-terminally truncated 
RSV IN(1-270), and its various mutant forms tested produced essentially the same 
crystals with indistinguishable X-ray diffraction properties. 
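
The design of the two oligonucleotide sets listed above can be checked with a short script. The sketch below is an illustration only, not part of the original methods. It assumes that the first oligonucleotide of each set is the 22-base non-transferred viral strand, the second is the transferred viral strand joined to the 6-base spacer and one target flank, and the third is the 16-base target strand; the function and variable names are hypothetical.

```python
"""Consistency check for the three-oligonucleotide half-site substrates.

The strand roles assigned below are an assumption made for illustration.
"""

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SPACER_LEN = 6  # central spacer between the two integration sites (bp)

# (non-transferred viral strand, joined viral/target strand, target flank strand)
SUBSTRATES = {
    "target_1": (
        "AATGTTGTCTTATGCAATACTC",
        "GAGTATTGCATAAGACAACAGTGCACGAATCTTGAAGACACT",
        "AGTGTCTTCAAGATTC",
    ),
    "target_2": (
        "AATGTTGTCTTATGCAATACTC",
        "GAGTATTGCATAAGACAACAGTCGACCAACCTTCAACTTAGC",
        "GCTAAGTTGAAGGTTG",
    ),
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def check_half_site(non_transferred, joined, flank, spacer_len=SPACER_LEN):
    """Split the joined strand into viral, spacer and flank parts and test pairing."""
    viral_len = len(joined) - spacer_len - len(flank)
    viral = joined[:viral_len]
    spacer = joined[viral_len:viral_len + spacer_len]
    flank_region = joined[viral_len + spacer_len:]
    return {
        "viral_duplex_bp": viral_len,
        # Unpaired bases left at the 5' end of the non-transferred strand.
        "overhang_nt": len(non_transferred) - viral_len,
        "viral_strands_pair": reverse_complement(non_transferred)[:viral_len] == viral,
        "spacer": spacer,
        # A self-complementary spacer lets two half-sites dimerize.
        "spacer_is_palindrome": spacer == reverse_complement(spacer),
        "flank_pairs": reverse_complement(flank) == flank_region,
        # Two flanks plus the spacer give the full target duplex length.
        "full_target_bp": 2 * len(flank) + spacer_len,
    }


REPORTS = {name: check_half_site(*oligos) for name, oligos in SUBSTRATES.items()}

if __name__ == "__main__":
    for name, report in REPORTS.items():
        print(name, report)
```

Under these assumptions both sets give a 20 bp viral duplex with a 2-nucleotide 5′ overhang, a palindromic spacer, and a 38 bp full target DNA (2 × 16 + 6). The same arithmetic reproduces the screened range of 34-50 bp for flanks of 14-22 bp.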

Structure determination. X-ray diffraction data were collected at the Advanced 
Photon Source Northeastern Collaborative Access Team beamlines (24-ID-C/E) 
and the Advanced Light Source Molecular Biology Consortium (4.2.2) beamline 
and processed using HKL2000 (ref. 32) or XDS*3. The RSV intasome crystals 
showed varying degrees of pseudo-merohedral twinning with twin operator 
[l, -k, h], owing to the very similar a and c unit cell dimensions of the primitive 
monoclinic lattice. Thus, we screened a large number of crystals to identify ones 
that diffracted to higher resolution and had smaller twin fractions. The structure 
of RSV intasome was determined by molecular replacement with PHASER™, using 
the RSV IN CCD, CTD (PDB accession number 4FW1)°, and a 16 bp B-form 
DNA as search models. Eight copies of CCD, one copy of CTD, and three copies 
of DNA molecules were located. Refinement of the partial model revealed electron 
density for two copies of the metatungstate clusters. The metatungstate clusters 
were placed into the electron density by molecular replacement using MOLREP». 
Subsequent iterative model building using COOT™ and refinement with PHENIX 
suite*” allowed placement of the remaining seven copies of CTD, eight copies of 
NTD generated using MODELLER®* on the basis of HIV-1 NTD (PDB accession 
number 1K6Y)*?, and building of the inter-domain linkers as well as the remain- 
ing parts of the DNA molecule guided by the difference electron density maps. A 
third metatungstate cluster, more weakly bound than the first two, was positioned 
manually into residual density. The DNA base pairs and base-stacking restraints 
were used throughout the refinement. The geometry restraints for protein included 
the reference-model restraints for CCD and CTD based on the higher-resolution 
RSV IN structure (4FW1)°, and the secondary structure and zinc-coordination 
restraints for NTD. Atomic displacement parameters were refined with grouped 
B-factors per residue for protein and DNA, and a total of 54 TLS groups assigned 
by PHENIX””. Twelve tungsten atoms representing each cluster were refined as a 
rigid body. The asymmetric unit of the crystal contains one complete RSV intas- 
ome, which includes eight IN molecules and two viral DNA branches emanating 
from a strongly bent 38 bp target DNA. The data set used for the final refinement 
was from an RSV intasome crystal grown using selenomethionine-labelled RSV 
IN(1-270) with the following amino-acid substitutions: C23S, L112M, L135M, 
L162M, L163M, L188M, and L189M, which we confirmed to be active in concerted 
integration and inhibited by INSTI similarly to the wild-type RSV IN, and a DNA 
substrate carrying a nick at the middle of each target DNA branch (the 16-base 
DNA strand shown in olive in Fig. 1b has a nick 8 bases from either end). The nick 
occasionally facilitated crystal growth, but it was not necessary for crystallization 
and did not change the space group or the unit cell parameters compared with 
crystals grown without the nick. Because this nick in the target DNA is not bio- 
logically relevant, it is not shown in Figs 1b, d-f, 3b and 5a, c to avoid confusion. 
Twin refinement protocol was not used as the data set used for the final refinement 
had a low (<10%) twin fraction. The summary of data collection and refinement 
statistics is shown in Extended Data Table 1. The paired-refinement procedure*” 
was performed in steps of 0.1 Å to determine the high-resolution limit (Extended 
Data Fig. 3f). The register of amino acids in the final model was verified by the sele- 
nium anomalous difference Fourier peaks (Extended Data Fig. 5). Ramachandran 
analysis shows that 96.0, 3.9, and 0.1% of the protein residues are in the most 
favoured, allowed, and disallowed regions, respectively. The NTD-CCD linker in 
some of the non-catalytic IN molecules and the last four base pairs at the distal end 
of one of the viral DNA molecules were not built owing to poor electron density. 
The molecular graphics images were produced using PYMOL (www.pymol.org). 
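
The link between the similar a and c cell edges and the pseudo-merohedral twinning described above can be illustrated numerically. The following is a minimal sketch, not the procedure used in the study. It takes the unit cell from Extended Data Table 1, treats the twin operator as the index transformation (h, k, l) → (l, −k, h), and asks how closely the transformed lattice matches the original one. The helper names are hypothetical.

```python
import math

# Unit cell reported in Extended Data Table 1 (monoclinic, unique axis b).
A, B, C = 124.9, 157.8, 126.6   # cell edges in angstroms
BETA_DEG = 110.9                # alpha = gamma = 90 degrees

# Twin operator (h, k, l) -> (l, -k, h) as a matrix. It is symmetric and its
# own inverse, so the index and basis-vector conventions give the same result.
TWIN_OP = [[0, 0, 1], [0, -1, 0], [1, 0, 0]]


def metric_tensor(a, b, c, beta_deg):
    """Metric tensor G (dot products of the basis vectors) for a monoclinic cell."""
    ac_cos_beta = a * c * math.cos(math.radians(beta_deg))
    return [
        [a * a, 0.0, ac_cos_beta],
        [0.0, b * b, 0.0],
        [ac_cos_beta, 0.0, c * c],
    ]


def transform(g, p):
    """Return P G P^T, the metric tensor of the lattice re-indexed by P."""
    return [
        [
            sum(p[i][k] * g[k][m] * p[j][m] for k in range(3) for m in range(3))
            for j in range(3)
        ]
        for i in range(3)
    ]


def cell_from_metric(g):
    """Recover (a, b, c, beta in degrees) from a monoclinic metric tensor."""
    a, b, c = (math.sqrt(g[i][i]) for i in range(3))
    beta = math.degrees(math.acos(g[0][2] / (a * c)))
    return a, b, c, beta


G = metric_tensor(A, B, C, BETA_DEG)
G_TWIN = transform(G, TWIN_OP)
TWIN_CELL = cell_from_metric(G_TWIN)

# Fractional misfit between the original a axis and the twin-related a axis.
AXIS_MISMATCH = abs(TWIN_CELL[0] - A) / A

if __name__ == "__main__":
    print("twin-related cell:", [round(x, 1) for x in TWIN_CELL])
    print("a-axis mismatch: {:.2%}".format(AXIS_MISMATCH))
```

The operator swaps a and c and inverts b, leaving β unchanged, so the two lattices differ only by the small difference between 124.9 Å and 126.6 Å (a misfit of roughly 1.4%). A near-coincidence of this kind is what allows twin domains to overlap in the diffraction pattern.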
Extended Data Figure 1 | Amino-acid sequence alignment of RSV, HIV-1, and PFV INs. The secondary structure elements for RSV IN are colour 
coded on the basis of the three IN domains similarly to Fig. 1a. The residue numbering at the top is for RSV IN. For each IN, the black dots mark every 
ten residues. This figure was made using ESPript*’. 
Extended Data Figure 2 | DNA substrate used for assembling the 
RSV intasome. a, The half-site (gapped duplex) substrate prepared by 
annealing three oligonucleotides dimerizes via the self-complementary 
six-base spacer sequence (underlined) to form a branched structure 
mimicking the product of the concerted integration reaction (Fig. 1b). 
b, DNA structure in the RSV intasome. Viral DNA nucleotides are 
numbered, and some of the structural elements of RSV IN involved in the 
viral DNA interactions are shown. 
Extended Data Figure 3 | RSV intasome crystal. a, A crystal of the RSV 
intasome. b, X-ray diffraction pattern from a crystal not treated with 
metatungstate. c, X-ray diffraction pattern after the metatungstate-soaking 
(see Methods for details). d, e, Lattice contacts within the RSV intasome 
crystal. DNA strands are coloured as in Fig. 1b, while IN subunits from 
only one intasome are coloured. The unit cell is shown in green. The small 
blue spheres represent tungsten atoms. The view in e is perpendicular 
to the twofold screw (b) axis of the monoclinic lattice, which lies 
horizontally. f, Paired-refinement analysis*’ to assess the resolution limit 
of the RSV intasome diffraction data. For each pairwise comparison, 
model refinements were run at two different resolution limits and the 
R-factors calculated for a common (lower) resolution cutoff were 
compared. Inclusion of data beyond 3.7 Å in the refinement compromised 
model quality. 
Extended Data Figure 4 | Biochemical characterization of RSV 
intasome. a, Representative SEC profile for RSV intasome, overlaid with 
that for a mixture of molecular mass markers. The buffer condition was 
as mentioned in the methods. b, SEC profiles for RSV intasomes formed 
with IN of varying C termini. c, SEC-MALS analysis of RSV intasome. The 
intasome formed with RSV IN (1-269 amino acids) was separated by SEC 
in a modified condition containing 20 mM HEPES pH 7.5, 1.0 M NaCl, 5% 
glycerol, and 1.0 mM TCEP. The absolute molecular mass was determined 
by light scattering using in-line detectors described previously*”. The mass 
profile for the intasome is shown in red across the peak. The molecular 
mass of RSV intasome was 240 ± 10 kDa (n = 4). A similar SEC-MALS 
analysis of the intasome formed with wild-type full-length RSV IN (1-286 
amino acids) yielded a molecular mass of 268 kDa (n = 2, data not shown). 
The calculated mass of an intasome containing eight RSV IN (1-269 or 
270) molecules is ~288 kDa. d, Chemical cross-linking analysis of RSV 
intasome. The RSV intasome and free IN (1-269 amino acids) were 
purified by SEC in the running buffer: 20 mM HEPES (pH 7.5), 1.0 M 
NaCl, 5% glycerol, and 1.0 mM TCEP. The peak fractions of the intasome 
and IN were cross-linked with the indicated amount of ethylene glycol 
bis-succinimidylsuccinate (EGS) as described previously” and analysed 
by SDS-PAGE. Most cross-linked species within the intasome were larger 
than a tetramer. The highest oligomeric species observed is consistent with 
an octamer migrating at ~220 kDa. The molecular mass markers are 
in the far right lane. A NuPAGE 4-12% gradient gel with a MES-based 
SDS-PAGE running buffer was used. 
Extended Data Figure 5 | Selenium anomalous difference Fourier peaks 
confirming the model. Anomalous difference Fourier maps calculated 
using the data collected on selenomethionine-labelled RSV intasome 
at the Se K-edge wavelength, contoured at 3.5σ (blue mesh) or 5.0σ 
(orange mesh). Methionine side chains are shown in sticks. a, A view 
covering the octameric RSV intasome. b, c, Close-up view of a proximal (b) 
and distal (c) IN dimer, respectively. 
Extended Data Figure 6 | Comparison between RSV and PFV 
intasomes. a, b, Protein arrangement in the octameric RSV intasome. 
The inner and outer subunits of one proximal IN dimer are coloured in green 
and cyan, respectively, with the catalytic triad (DDE) of the inner subunit 
shown in red. The other proximal IN dimer is coloured similarly but in 
paler colours. Arrows indicate the two proximal IN dimers. A distal 
IN dimer is coloured in slate and orange. DNA is omitted in a. b, Same 
view as Fig. 3c. c, Close-up view around the active site of the inner IN 
subunit in the RSV intasome. The DNA strands are coloured as 
in Fig. 1, and the catalytic triad residues (DDE) are shown in red sticks. 
d, e, Protein arrangement in the tetrameric PFV intasome?”. The colour 
scheme follows that used for the proximal IN dimers of RSV IN in a and b. 
Arrows indicate the two IN dimers. DNA is omitted in d. e, Same view 
as Fig. 3f. f, Close-up view around the active site of the inner IN subunit 
in the PFV intasome (PDB accession number 3OS0 (ref. 3)). The colour 
scheme follows that in c. 
Extended Data Figure 7 | Composite omit maps. Simulated annealing composite omit 2mFo − DFc density contoured at 1.0σ, shown for area within 
3.5 Å from any protein or DNA atom in the final model. Various parts of the RSV intasome are shown in panels a-i. In a and b, electron densities around 
protein and DNA are coloured differently (blue and green, respectively). 
Extended Data Figure 8 | DNA conformations in the RSV and PFV 
intasomes. a, b, DNA structure in the RSV (a) or PFV (b) intasome, 
alternatively referred to as the strand-transfer complex (STC). The PFV 
intasome model is PDB accession number 3OS0 (ref. 3). c, A comparison 
of DNA structures between the RSV and PFV intasomes (STCs). The 
integration product DNAs (RSV in cyan, PFV in red) superimposed at 
a viral DNA terminus are shown in three different view angles. Note the 
significant deviation in overall trajectory of the target DNA, and difference 
in the orientation of the second viral DNA molecule. The region spanning 
the two integration sites on opposing target DNA strands is 6 bp for RSV 
and 4 bp for PFV. 
Extended Data Figure 9 | Electron density for the central 6 bp of the target DNA. The sigma-A weighted 2mFo − DFc map contoured at 1.5σ (a) or 
2.5σ (b), overlaid with the final model for the central 6 bp region between the two integration sites. 
Extended Data Table 1 | Data collection and refinement statistics 

                              RSV intasome 
Data collection 
Space group                   P2₁ 
Cell dimensions 
  a, b, c (Å)                 124.9, 157.8, 126.6 
  α, β, γ (°)                 90.0, 110.9, 90.0 
Resolution (Å)                49.3 - 3.80 (3.90 - 3.80) 
Rmerge (%)                    14.7 (247.6) 
I/σI                          6.0 (0.6) 
CC1/2                         0.994 (0.287) 
Completeness (%)              99.2 (99.3) 
Redundancy                    5.5 (5.5) 

Refinement 
Resolution (Å)                49.3 - 3.80 
No. reflections               44627 
Reflections for Rfree         2183 
Rwork/Rfree                   25.4/29.4 
No. atoms 
  Protein 
  Ligand/ion 
  Water 
B-factors 
  Protein 
  Ligand/ion 
  Water 
R.m.s. deviations 
  Bond lengths (Å) 
  Bond angles (°) 

Statistics for the highest resolution shell are shown in parentheses. 
TECHNOLOGY FEATURE 


DNA TAGS HELP THE 
HUNT FOR DRUGS 


Drug discovery is a daunting process that requires chemists to sift through millions of 
chemicals to find a single hit. DNA technology can dramatically speed up the search. 


BY ASHER MULLARD 


Nestled in a plastic box, in an ordinary 
laboratory freezer on the second floor 
of a concrete building in Waltham, 
Massachusetts, is a clear test tube that contains 
a concoction of astronomical proportions. The 
library frozen within, a collection of chemical 
compounds owned by London-based phar- 
maceutical company GlaxoSmithKline (GSK), 
contains as many as 1 trillion unique DNA- 
tagged molecules — ten times the number of 
stars in the Milky Way. 

This and other such libraries are helping 
pharma companies and biotechnology firms 
to quickly identify candidate drugs that can 
latch onto the proteins involved in disease, 
especially those proteins that have proved 
difficult to target. They enable screening to be 
performed much more rapidly and cheaply than 
with conventional methods. And academic 
scientists can also use them to probe basic biol- 
ogy questions and investigate enzymes, recep- 
tors and cellular pathways. 

Drug discovery often starts with research- 
ers assembling large libraries of chemicals 
and then testing them against a target pro- 
tein. Compounds are added individually to 
wells that contain the target to see whether 
they affect its activity. This approach, known 
as high-throughput screening (HTS), is auto- 
mated using robotic equipment to test millions 
of chemicals, but it’s still laborious, expensive 
and not always successful. 


Over the past few years, medicinal chem- 
ists have been increasing the odds of finding 
potentially useful compounds by labelling 
chemical compounds with bits of barcode- 
like DNA. These DNA-encoded libraries — 
which can dwarf conventional small-molecule 
libraries — offer all sorts of advantages to drug 
discovery. For a start, rather than testing each 
compound individually, researchers can put 
all of the DNA-tagged small molecules into a 
single mixture and then introduce the target 
protein. Any compounds that bind with the 
target can be identified easily thanks to their 
DNA barcodes. 
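
The barcode idea can be made concrete with a toy model. The sketch below is a simplified illustration, not any company's actual encoding scheme. It assumes four synthesis cycles with 1,000 building blocks each and a hypothetical five-base DNA tag per building block, and it shows how a compound's tag is written during synthesis and read back after a pooled screen. All names and numbers are illustrative.

```python
from math import prod

BASES = "ACGT"
TAG_LEN = 5                               # hypothetical tag length per building block
BLOCKS_PER_CYCLE = [1000, 1000, 1000, 1000]  # hypothetical library design


def make_codebook(n_blocks, tag_len):
    """Give each building-block index a distinct DNA tag (base-4 counting)."""
    if n_blocks > len(BASES) ** tag_len:
        raise ValueError("tag too short to label every building block")
    codebook = {}
    for index in range(n_blocks):
        digits, value = [], index
        for _ in range(tag_len):
            digits.append(BASES[value % 4])
            value //= 4
        codebook[index] = "".join(reversed(digits))
    return codebook


def encode(compound, codebooks):
    """Concatenate one tag per synthesis cycle into the compound's barcode."""
    return "".join(book[block] for block, book in zip(compound, codebooks))


def decode(barcode, codebooks, tag_len):
    """Read a sequenced barcode back into the building blocks it records."""
    lookups = [{tag: block for block, tag in book.items()} for book in codebooks]
    tags = [barcode[i:i + tag_len] for i in range(0, len(barcode), tag_len)]
    return tuple(lookup[tag] for lookup, tag in zip(lookups, tags))


def library_size(blocks_per_cycle):
    """Number of distinct compounds from combining every block at every cycle."""
    return prod(blocks_per_cycle)


CODEBOOKS = [make_codebook(n, TAG_LEN) for n in BLOCKS_PER_CYCLE]

# A compound is the tuple of building-block indices chosen at each cycle.
hit = (12, 0, 999, 345)
barcode = encode(hit, CODEBOOKS)

if __name__ == "__main__":
    print("library size:", library_size(BLOCKS_PER_CYCLE))
    print("barcode for", hit, "is", barcode)
    print("decoded:", decode(barcode, CODEBOOKS, TAG_LEN))
```

With these illustrative numbers, four cycles of 1,000 building blocks give 10^12 combinations, the same order as the trillion-member library described above, yet each compound is identified by a tag only 20 bases long.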

DNA-encoded libraries were first proposed 
in 1992, in a thought experiment’ by molecular 
biologist Sydney Brenner and chemist Richard 


BUILDING BARCODES 

DNA-encoded libraries consist of billions of molecules 
tagged with identifying DNA ‘barcodes’. The biggest 
libraries are built using DNA recording, but DNA 
templating allows for finer control. 

DNA recording 

1. A building block (BB), such as an amino acid, is 
tagged with a DNA sequence. Then a second BB is 
added, and the DNA tag is lengthened with a 
sequence that corresponds to the second BB. 
Compounds of up to four BBs can be assembled. 
At each step, the BBs and tags are added to 
mixtures of thousands of different tagged BBs to 
quickly grow libraries to vast sizes. 

2. A target protein is added to the mixture of 
billions of unique DNA-tagged molecules, some 
of which bind to the target. The DNA barcodes of 
the bound molecules are used to identify them. 

DNA templating 

1. Each molecule is designed as a single-stranded 
DNA template. BBs are tagged with DNA 
‘anti-barcodes’ that are complementary to 
regions on the planned molecule’s template. 

2. A DNA-tagged BB binds to its corresponding 
section on the template. A second DNA-tagged 
BB is added and binds to its corresponding 
template position, and the two BBs join in a 
chemical reaction. More BBs are added to 
complete the template. 

3. A final chemical reaction can convert a string of 
building blocks into a ring, producing barcoded 
macrocycles. 


Lerner, who were then at the Scripps Research 
Institute in La Jolla, California. They have been 
gaining momentum ever since. In 2007, GSK 
acquired one of the firms that pioneered these 
libraries, Praecis Pharmaceuticals in Waltham, 
for US$55 million. The drug firms Novartis 
and Roche, both in Basel, Switzerland, have 
started their own in-house DNA-encoded- 
library programmes. A burgeoning group of 
biotech companies — including X-Chem in 
Waltham; Vipergen in Copenhagen; Ensemble 
Therapeutics in Cambridge, Massachusetts; 
and Philochem in Zurich, Switzerland — has 
meanwhile built up a who's who list of industry 
and academic partners that are eager to use the 
technology. 

“People understand now that this is not a
fad,” says Robert Goodnow, executive director
of the Chemistry Innovation Center at Astra-
Zeneca in Boston, Massachusetts, which col-
laborates with X-Chem. “It’s for real.”

DNA-encoded libraries will not replace
HTS: companies have already invested heavily
in HTS, and there are some
compounds that cannot be synthesized using 
DNA-encoding technologies. Rather, they 
offer a complementary way to quickly, effi- 
ciently and cheaply find chemical structures 
that bind to new or historically challenging 
targets, such as ubiquitin ligases, which flag 
proteins for disposal and could be targeted in 
cancer therapy. 


BIG IS BEAUTIFUL 

GSK currently has the world’s biggest DNA- 
encoded library: it is an impressive 500,000 
times larger than the company’s 2-million- 
compound HTS library. 

There are several ways to build DNA- 
encoded libraries: the biggest ones, like GSK’s, 
are made using an approach called ‘DNA
recording’ (see ‘Building barcodes’). Chemical
building blocks, such as amino acids, amines 
and carboxylic acids, are synthesized and then 
tagged with a unique DNA barcode through 
a chemical reaction. A second building block 
is added to the mix to make a new small mol- 
ecule, and the DNA barcode is then length- 
ened. By joining up to four blocks, chemists 
can create drug-like molecules. And because 
they have thousands of building blocks to play 
with, the number of potential combinations is 
enormous. 
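The arithmetic behind that claim can be sketched in a few lines of Python. This is only a toy illustration of the DNA-recording idea: the three building blocks, their four-letter tags and the assumption that every block is available at every step are invented for the example, and real libraries use far more blocks and longer tags.

```python
from itertools import product

# Hypothetical building blocks, each paired with a made-up DNA tag.
# Real libraries use thousands of blocks and longer tags.
BLOCK_TAGS = {
    "Ala": "ACGT",
    "Gly": "AGCT",
    "Ser": "CATG",
}


def build_library(cycles):
    """Return {barcode: compound} for every ordered combination of blocks."""
    library = {}
    for combo in product(BLOCK_TAGS, repeat=cycles):
        # The barcode grows by one tag each time a block is added.
        barcode = "".join(BLOCK_TAGS[block] for block in combo)
        library[barcode] = combo
    return library


def library_size(num_blocks, cycles):
    """Number of compounds if any block can be used at each cycle."""
    return num_blocks ** cycles


toy_library = build_library(cycles=2)
print(len(toy_library))          # 9
print(toy_library["ACGTCATG"])   # ('Ala', 'Ser')
print(library_size(1000, 4))     # 1000000000000
```

Under these simplifying assumptions, 1,000 building blocks joined four at a time already give a trillion combinations, the same order of magnitude as a library 500,000 times larger than a 2-million-compound collection.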

Compared with conventional HTS libraries, 
for which chemists have to test each compound 
individually, DNA-encoded libraries are easier 
to maintain and use. A DNA-encoded library 
can be stored in a single test tube, whereas an 
HTS library requires robot-filled facilities 
that are large enough to store each compound 
individually. 

But the true beauty of DNA-encoded librar- 
ies, says Chris Arico-Muendel, a manager at 
GSK in Waltham, is the sheer number of 
chemical structures it is possible to synthe- 
size. The company’s drug-discovery team now 
uses the DNA-encoded library as frequently as,
if not more frequently than, the HTS library
for new and difficult protein targets. The
most advanced compound to emerge from
the company’s DNA-encoded library so far is
GSK2256294, which blocks epoxide hydrolase,
an enzyme that is involved in breaking
down lipids. This drug candidate came out
of GSK’s collaboration with Praecis and has
completed first-in-human safety studies that
may support further evaluation of its use in
diabetes, wound healing or as a therapy for
chronic obstructive pulmonary disease. “We
are pleased with how things are going with
DNA-encoded libraries within GSK,” says
Arico-Muendel.

And as more chemical building blocks are 
created, along with extra ways to attach them 
to one another, the breadth of these libraries 
will continue to expand. 

In the near future, DNA-encoded librar- 
ies will not only become bigger and broader 
but might also provide hits that can quickly 
be moved into the clinic, says X-Chem chief 
executive Richard Wagner. With conventional 
screening, medicinal chemists sometimes have 
to spend many years tweaking compounds to 
make them specific, potent and safe enough to 
enter the clinic. “This is just a game of odds,” 
says Wagner. By contrast, the large size of 
DNA-encoded libraries means that, by chance, 
some of the compounds they include will be 
more clinic-ready than others. Although the 
compounds will still require optimization, “we
can get things that are pretty close”, he says.

X-Chem, which holds 120 billion com- 
pounds in its DNA-encoded library, is already 
starting to see this in practice. It took just 
one year to move its most advanced candi- 
date — an autotaxin inhibitor that blocks the 
conversion of one phospholipid into another 
— from a screening hit to a clinical candi- 
date. A spin-off company of X-Chem, X-Rx 
in Wilmington, North Carolina, now plans to 
start clinical trials of the compound for fibrosis 
in 2017. Interest in X-Chem’s library is spread- 
ing across the industry: in the past five years, 
the company has forged collaborations and 
licensing agreements with several major phar- 
maceutical firms — including Roche, Astra- 
Zeneca, Bayer, Johnson & Johnson, Pfizer and 
Sanofi — as well as with a host of biotech and 
academic partners. 


MADE TO ORDER 

Other biotech firms have added an interest- 
ing twist. They use the DNA tag not only to 
identify a compound but also as a template to 
make it. David Liu, a chemist at Harvard Uni- 
versity in Cambridge, and his students devel- 
oped this ‘DNA-templated’ approach and used 
it to build a library of circular molecules called 
macrocycles. These larger, more-stable, ring-
shaped molecules interact with the target at
multiple sites, boosting the specificity of the
binding reaction. (GSK and X-Chem also have
extensive macrocycle collections in their DNA-
encoded libraries.)

SOURCE: ADAPTED FROM THE SCIENTIST 28, 62-65 (2014)

JAMES KING-HOLMES/SPL

Liu first creates single-stranded DNA tem- 
plates that act as guides — these consist of 
several regions that are complementary to the 
DNA tags on his chemical building blocks. 
He then sequentially adds the DNA-tagged 
building blocks into a reaction vessel, relying 
on DNA base pairing to physically bring the 
tagged building blocks close enough together to 
bind to one another. A final reaction then con- 
verts the strings of building blocks into rings, 
producing macrocycles that are each tethered 
to a unique DNA barcode. 
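The base-pairing logic can be illustrated with a short Python sketch. The anti-barcode sequences, the block names and the fixed six-letter region length are all hypothetical, chosen only to show how a template's regions dictate which tagged building blocks line up and in what order; it does not model the chemistry itself.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the strand that would base-pair with seq."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical building blocks keyed by their DNA 'anti-barcode'.
ANTI_BARCODES = {
    "TTGACC": "block_A",
    "GCATTA": "block_B",
    "CCGTAG": "block_C",
}


def read_template(template, region_length=6):
    """List the building blocks a template recruits, in template order."""
    blocks = []
    for start in range(0, len(template), region_length):
        region = template[start:start + region_length]
        # A block is recruited when its anti-barcode pairs with the region.
        anti = reverse_complement(region)
        blocks.append(ANTI_BARCODES[anti])
    return blocks


# A template whose three regions pair with the anti-barcodes for B, A and C.
template = "".join(
    reverse_complement(tag) for tag in ("GCATTA", "TTGACC", "CCGTAG")
)
print(read_template(template))  # ['block_B', 'block_A', 'block_C']
```

Because the template fixes the sequence of blocks in advance, the identity of each product is known before it is made, which is what allows the purification step described next.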

Constructing a DNA-templated library 
involves a hefty workload because researchers 
must design a template for each molecule as 
well as tagging thousands of building blocks 
with DNA. As a result, DNA-templated librar-
ies are smaller than DNA-recorded ones, but
they still eclipse HTS libraries in terms of 
size — and they have other advantages, too. 
Because scientists know at the outset what 
compounds they are producing, they can 
purify the DNA-templated libraries to remove 
compounds that are tagged inaccurately. This 
step translates into a high degree of confidence 
in the hits. By contrast, colossal DNA-recorded 
libraries may still contain wrongly tagged com- 
pounds, and thus might yield hits that will send 
researchers on wild goose chases. 

Liu’s 14,000-strong library has already led
to a few triumphs. In 2014, his team reported
that it had solved a problem that researchers
had been struggling with for decades when it
found a specific and stable small molecule that
can block insulin-degrading enzyme (IDE), which
has been linked with type 2 diabetes. He and
others have started to unravel the role of IDE
in both health and disease, which has led to the
identification of other IDE inhibitors. Discussions
are under way to develop these into drugs.

Liu has also screened more than 100 other 
targets, many brought to him by academic 
collaborators who need small-molecule 
inhibitors of their pet proteins. “I never would
have thought seven years later that this first-
generation library would still be providing us
with interesting biological discoveries,” he says.
“But it has proved to be very fruitful. We have
had more hits against targets from our first
library than we can follow up on.”

He is nevertheless putting the finish- 
ing touches to a second-generation, 
256,000-macrocycle library that could open 
up even more biology. Ensemble Therapeu- 
tics, which Liu founded in 2004, now has more 
than 10 million macrocycles in its library. The 
company is focusing on targets among the 
immune checkpoint proteins, which modulate
the immune system, and the ubiquitin ligases. It
has also granted a license to Novartis to develop
one of its finds, a molecule that targets the
inflammation-linked protein interleukin-17.

Sydney Brenner, who conceptualized DNA-
encoded libraries with Richard Lerner in 1992.


SWEET SCREENS 

Once the library is built, the fun of identifying 
which molecules stick to a target begins. Most 
researchers rely on ‘affinity-based screening’ 
to find those compounds. For this, they engi- 
neer the protein target to include a purification 
tag. They then pass the mixture containing the 
library and target through a purification col- 
umn, using the purification tag to pull out the 
bound pairs. The last step is to read the DNA 
identifiers linked to the small molecules using 
a DNA sequencer. 
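The final read-out step amounts to counting barcodes. The Python sketch below shows one simple way such counts might be turned into a ranking. The reads are made up, and comparing against a control run without the target is an assumption of this example, not a description of any company's pipeline.

```python
from collections import Counter


def enrichment(selected_reads, control_reads, pseudocount=1):
    """Ratio of each barcode's read fraction after selection vs a control."""
    selected = Counter(selected_reads)
    control = Counter(control_reads)
    barcodes = set(selected) | set(control)
    # The pseudocount avoids division by zero for barcodes seen in one run only.
    sel_total = len(selected_reads) + pseudocount * len(barcodes)
    ctl_total = len(control_reads) + pseudocount * len(barcodes)
    scores = {}
    for barcode in barcodes:
        sel_frac = (selected[barcode] + pseudocount) / sel_total
        ctl_frac = (control[barcode] + pseudocount) / ctl_total
        scores[barcode] = sel_frac / ctl_frac
    return scores


# Made-up sequencing reads: one barcode per read.
control_reads = ["AAAA"] * 10 + ["CCCC"] * 10 + ["GGGG"] * 10
selected_reads = ["AAAA"] * 2 + ["CCCC"] * 25 + ["GGGG"] * 3

scores = enrichment(selected_reads, control_reads)
top_hit = max(scores, key=scores.get)
print(top_hit)  # CCCC
```

A barcode that scores well above 1 was pulled out with the target more often than its share of the starting mixture would predict, which makes the compound it labels a candidate hit.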

This method can yield results even with 
minute amounts of a target protein. In one 
project, remembers Arico-Muendel, aca- 
demic collaborators wanted to screen an 
unstable protein that they could produce only 
in tiny quantities. “They flew it here on dry ice
overnight, and we immediately did the entire
screen on it,” he says. “And that actually gave
some really good hits.” Such experiments are 
impossible with HTS, because the target pro- 
tein must be stable and abundant enough to be 
added into millions of wells before the experi- 
ment can begin. 

But affinity-based screening has its shortfalls. 
The clunky DNA tag can sometimes impede 
interaction with the target, and some potential 
candidates may be lost. But because the DNA- 
encoded libraries are so big, screeners are not 
typically too concerned with these losses. More 
problematically, small molecules and their tags 
can bind to the purification column and gener- 
ate false-positive hits. The purification tag can 
also interfere with the structure of the target 
protein, introducing confusion in the data. 

Several groups have developed solutions 
for this problem. Vipergen, a biotech firm that 
has a DNA-templated library with 50 million
molecules, has put its hope in a ‘binder trap
enrichment’ strategy.

Imagine, says Nils Hansen, the company’s 
chief executive, that you could freeze your 
protein-library mixture and cut it into super- 
small ice cubes. If the ice cubes are small 
enough, each will be able to contain only a 
single target protein. At this size, small mol- 
ecules that bind to the target will be consist- 
ently overrepresented in ice cubes that contain 
targets, even without a purification strategy. 
Vipergen has achieved the same effect by put- 
ting its screens into water-and-oil emulsions, 
in which minuscule water droplets stand in for 
ice cubes. “It’s pretty cool,” says Hansen.
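A back-of-the-envelope model shows why the ice-cube picture works. The Python sketch below is a deliberately crude illustration, not Vipergen's method: it assumes that a fixed fraction of droplets contain a target, that a bound molecule always travels with its target, and that an unbound molecule lands in a droplet at random. Both function names and the example numbers are invented.

```python
def cooccurrence_probability(target_fraction, bound_fraction):
    """Chance that a library molecule shares a droplet with a target.

    target_fraction: share of droplets that contain a target protein.
    bound_fraction: share of this molecule's copies bound to a target.
    """
    # Bound copies always co-occur; unbound copies do so only by chance.
    return bound_fraction + (1 - bound_fraction) * target_fraction


def fold_enrichment(target_fraction, bound_fraction):
    """Co-occurrence relative to a molecule that never binds."""
    chance = target_fraction
    return cooccurrence_probability(target_fraction, bound_fraction) / chance


# With targets in 1% of droplets, compare a non-binder with a molecule
# that is bound to the target half of the time.
print(fold_enrichment(0.01, 0.0))
print(fold_enrichment(0.01, 0.5))
```

In this toy model the non-binder scores 1 and the binder scores roughly 50. The smaller the share of droplets that hold a target, the more a true binder stands out, with no purification column involved.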

Currently, screens of DNA-encoded libraries 
work best with free-floating, soluble proteins. 
But many appealing drug targets are embed- 
ded in the cell surface, making them impos- 
sible to probe with traditional affinity-based 
screening. For example, an estimated 40% 
of approved drugs target membrane-bound 
G-protein-coupled receptors, which sense 
molecules outside the cell. The technology 
for screening membrane-bound proteins is 
evolving, says Goodnow, “but is still kind of 
a challenge”. 

One way forward is to mix a DNA-encoded 
library with intact cells that overexpress a 
membrane-bound target. The small molecules 
can then bind to the targets on the surface of 
the cell. After the researchers wash away the 
unbound library, they can identify the bound 
small-molecule hits by heating up the cells and 
reading the eluted DNA tags. GSK has used 
this approach to identify potent inhibitors of 
a receptor that has been implicated in schizo- 
phrenia and disorders of the central nervous 
system.

X-Chem, too, is starting to see success with 
screens of membrane-bound proteins. “His- 
torically, the majority of our programmes were 
on soluble proteins. But there is a shift given the
recent data we've been able to generate with 
really difficult membrane-bound proteins,” 
says Wagner. 

With DNA-encoded libraries continuing to 
expand, and new screening approaches open- 
ing up uncharted biological space, he adds, 
“DNA-encoded libraries are set to become one
of the pillars of discovery in the pharmaceuti-
cal industry.”


Asher Mullard is a freelance science journalist 
in Ottawa, Canada. 
SCIENCE FICTION

CHRYSALIS

Flight into danger.

BY ANDREA KRIZ

I remember the broadcasts. BREAKING
NEWS: EXPLOSIVE END FOR
PROJECT CHRYSALIS. An air-force
pilot lowering herself into the cockpit with
a thumbs-up and a smile. Nursing black eyes
and bruises in the Earth Archives, I glued
myself to the screen a thousand times as
she unravelled, as glass melted into a mem-
brane that engulfed her in blinding light.
And I recall the curses behind my father and
me in the Afernsi refugee ship’s mush line.
“We never should've toyed with it. It was
never meant for us. Charlie Bickoid — your
curiosity killed us all.”

My father always spoke of it like some
kind of fairy tale with a moral. Project
Chrysalis, humankind’s attempt to
catch up in the galactic arms race. To
reverse-engineer an alien ‘ship’ that fell
from the heavens. The biochemists puri-
fied a protein-like macromolecule from
its circuits. Dismissed it as contamination.
An enzyme, soon discovered. A life form.
When the Sopholid pupa exploded in
the US Air Force Test Center, it sent out
some kind of signal. THIS JUST IN. Two
wings unfurling out of the inferno.

I look up at the dogfight and
wonder if he’s up there. If that’s his
plane supernovaing in the violet
skies, torn in two by the swarm.

My father. He’s decades too old to
be crunched up in a turret ball. He's
a neuroscientist. It should be me. But I've no
sense of duty, I tried to run from the draft
and they dragged me to this, humanity’s last
refuge of a colony, kicking and screaming.

Like the Sopholid that Buffalo zapped,
writhing beside me. A Mantis. Blood still
streams off its scythes. The remains of its
core drip into the bone-white rock at my
feet. I wonder if it used to be human. These
gleaming orbs — their Queen's heart throbs
overhead — reason enough, apparently, to
hound us to the far edge of the Galaxy, to
plague our colonies until each and every one
of us drools lobotomized and domesticized
on one of their farms. “Condensed organic
material,” was all Commander Xexe would
tell us in our briefings. The Sopholids have
strengthened a hundredfold since devel-
oping human-powered cores, the Afernsi
added grimly. It was good of the Afernsi to
rescue a couple of hundred thousand of us
from Earth if that’s the case, I suppose. As
the last troopship fades into the atmosphere,
I wonder how many they’ve
deigned to save this time.
“Now. 
Buffalo’s shaking. His eyes dart 
between the Locusts gnawing on the 
rest of our squad and the Queen above, 
coiled around the hover-carrier of evacu- 
ees we were supposed to protect. 

“I’ll fly up there and tear her a new one,”
I whisper, crouching by the Mantis’s head. 
“And if the Queen shuts down, the rest of the 
swarm should retreat, right?” 

“I’ll shoot you if you touch that thing,”
Buffalo mumbles. 

He wants me to remember what happened 
the last time a human wired herself into a Sop- 
holid’s brain. Biomechanical equations that’ve 
been worked out in excruciating detail since. 
The raw energy of metres of DNA uncoil- 
ing from each and every one of the trillions 
of cells in the human body simultaneously. 
Amplified by the circuits of the Sopholid in 
a way we still don’t fully understand — such 
was the topic of my PhD research that I hope 
to, no, that I will return to after the war — it 
was akin to a nuclear bomb. 

“Are you saying you want me to just stand 
here and die?” 


He says nothing. [...]
the fire button. All we've got left are our
good ol’ Earth rifles, whose bullets
bounce off Sopholid exoskeletons like 
staples. 
A sound like a beetle popping 
underfoot rings out. Buffalo crumples, 
clutching his side. 
“Turbo Eater,” he finally manages. “You
really believe humans can fly?” 
“I don't know,” I admit. “We can pilot
ships and things.” 
Not just to this miserable, terraformed 
rock that the nearest star hasn't risen on for 
days. But Satellite 10388. The Galactic Uni- 
versity. My home. And the Afernsi refugee 
ship that brought me there. From Earth, but 
I don’t remember any of it. I remember bit-
ing into a hydroponic tomato I stole when I
was eight, the skin bursting, the juice drib- 
bling down my lips. The constant ache of 
hunger. The stench. My dad and I lived in 
20 square feet of space behind a radioactive 
waste container. And to conserve oxygen, 
they’d only let us open the hatch to roll out
bodies twice a week. 

“That's how we got here, right?” 

I crawl forward. With that movement 
I affirm that I am, with all my heart, a 
believer in the Pilot Hypothesis. My 
father’s, Charlie Bickoid’s, fatal idea that 
someone built them. The Sopholids. That 
they can be controlled. I pry my fingers 
into the Mantis’s skull — cockpit — curl 
up in the wetness, let the translucent plates 
close behind me. Brain mass glows to life 
around me. Spidery characters flash before 
my eyes. But I swipe all the warnings aside. 
Tendrils feel me out. I shudder as they bur- 
row into my spine. 

The film of the canopy was cracked and 
bloody, but suddenly I can see through it, 
somehow. I clench my fist and the scythe of 
the Mantis moves with me. And the Queen 
turns to look down at me at last. We thought 
their eyes were blank, compound, insec- 
toid. But suddenly I can see all the colours 
of another rainbow in there. A rainbow I 
couldn't even have begun to comprehend 
before. They’re calling me. She’s calling me 
to join her. 

“Turbo Eater.” The wind whistles in Buffa-
lo’s mouth because I'd pulled the trigger first,
I’d shot him hours and hours ago. “Fly.”


Andrea Kriz is a PhD student in biological 
and biomedical sciences at Harvard 
University. She is the winner of the 2015 Ilona 
Karmel Prize for Writing Science Fiction. 


ILLUSTRATION BY JACEY 


